Github messages for voidlinux
* [PR PATCH] New package: zn_poly-0.9.2
@ 2021-11-09  1:24 tornaria
  2021-11-09  1:28 ` tornaria
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: tornaria @ 2021-11-09  1:24 UTC
  To: ml

[-- Attachment #1: Type: text/plain, Size: 645 bytes --]

There is a new pull request by tornaria against master on the void-packages repository

https://github.com/tornaria/void-packages zn_poly
https://github.com/void-linux/void-packages/pull/33969

New package: zn_poly-0.9.2
Dependency for sage. I compiled sage-9.4 using this.

I ran the tuning on my box, once in 64-bit mode and once in 32-bit mode, and hardcoded the resulting tuning files. They won't be optimal for all CPUs, particularly non-x86 ones, but this seems a fair solution and avoids running the tuning on every build. Tuning wouldn't work when cross-compiling anyway.
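
To illustrate what the hardcoded tuning files encode, here is a minimal sketch of how a threshold row could be used to choose a multiplication algorithm for a given polynomial length. The names `mul_alg_t`, `mul_thresholds_t`, `pick_mul_alg` and `row_for_this_build` are hypothetical and are not zn_poly's API; the comparison rule (`n < threshold`) and the word-size test via `unsigned long` are assumptions. Only the numbers are taken from the attached patch (the `bits = 9` multiplication thresholds of `tuning-32.c` and `tuning-64.c`).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical algorithm tags; zn_poly's internal names may differ. */
typedef enum { ALG_KS1, ALG_KS2, ALG_KS4, ALG_FFT } mul_alg_t;

/* Hypothetical reduced view of one tuning row: multiplication only. */
typedef struct {
    size_t ks2_thresh;  /* KS1 -> KS2 multiplication threshold */
    size_t ks4_thresh;  /* KS2 -> KS4 multiplication threshold */
    size_t fft_thresh;  /* KS4 -> FFT multiplication threshold */
} mul_thresholds_t;

/* Values copied from the "bits = 9" rows of the attached tuning files. */
static const mul_thresholds_t row_bits9_32 = { 24, 206, 62020 };
static const mul_thresholds_t row_bits9_64 = { 43, 589, SIZE_MAX };

/* Assumed rule: use the first algorithm whose upper threshold exceeds n.
   A SIZE_MAX threshold effectively disables the next algorithm. */
mul_alg_t pick_mul_alg(const mul_thresholds_t *t, size_t n)
{
    assert(t != NULL);
    if (n < t->ks2_thresh) return ALG_KS1;
    if (n < t->ks4_thresh) return ALG_KS2;
    if (n < t->fft_thresh) return ALG_KS4;
    return ALG_FFT;
}

/* Assumed selection by word size, mirroring the 32-bit/64-bit split of
   the two hardcoded files. */
const mul_thresholds_t *row_for_this_build(void)
{
    return (sizeof(unsigned long) * CHAR_BIT == 64) ? &row_bits9_64
                                                    : &row_bits9_32;
}
```

Under these assumptions, a length-100 multiplication with the 32-bit row would use KS2, while the 64-bit row never switches to FFT for multiplication because its last threshold is `SIZE_MAX`.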

A patch file from https://github.com/void-linux/void-packages/pull/33969.patch is attached

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: github-pr-zn_poly-33969.patch --]
[-- Type: text/x-diff, Size: 61814 bytes --]

From 1d3c308df5307cad2f353c7b1af2395d495b109d Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Gonzalo=20Tornar=C3=ADa?= <tornaria@cmat.edu.uy>
Date: Sun, 29 Aug 2021 20:05:24 -0300
Subject: [PATCH] New package: zn_poly-0.9.2

---
 srcpkgs/zn_poly/files/tuning-32.c | 444 +++++++++++++++
 srcpkgs/zn_poly/files/tuning-64.c | 860 ++++++++++++++++++++++++++++++
 srcpkgs/zn_poly/template          |  32 ++
 3 files changed, 1336 insertions(+)
 create mode 100644 srcpkgs/zn_poly/files/tuning-32.c
 create mode 100644 srcpkgs/zn_poly/files/tuning-64.c
 create mode 100644 srcpkgs/zn_poly/template

diff --git a/srcpkgs/zn_poly/files/tuning-32.c b/srcpkgs/zn_poly/files/tuning-32.c
new file mode 100644
index 000000000000..18396ce3d444
--- /dev/null
+++ b/srcpkgs/zn_poly/files/tuning-32.c
@@ -0,0 +1,444 @@
+/*
+   NOTE: do not edit this file! It is auto-generated by the "tune" program.
+   (Run "make tune" and then "./tune > tuning.c" to regenerate it.)
+*/
+
+/*
+   tuning.c:  global tuning values
+
+   Copyright (C) 2007, 2008, David Harvey
+
+   This file is part of the zn_poly library (version 0.9).
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) version 3 of the License.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+*/
+
+#include "zn_poly_internal.h"
+
+size_t ZNP_mpn_smp_kara_thresh = 43;
+size_t ZNP_mpn_mulmid_fallback_thresh = 551;
+
+tuning_info_t tuning_info[] = 
+{
+   {  // bits = 0
+   },
+   {  // bits = 1
+   },
+   {  // bits = 2
+         47,   // KS1 -> KS2 multiplication threshold
+       1053,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         43,   // KS1 -> KS2 squaring threshold
+       1053,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         56,   // KS1 -> KS2 middle product threshold
+        689,   // KS2 -> KS4 middle product threshold
+      23040,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 3
+         39,   // KS1 -> KS2 multiplication threshold
+        412,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         39,   // KS1 -> KS2 squaring threshold
+        315,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         43,   // KS1 -> KS2 middle product threshold
+        264,   // KS2 -> KS4 middle product threshold
+      23040,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+         12    // nussbaumer squaring threshold
+   },
+   {  // bits = 4
+         39,   // KS1 -> KS2 multiplication threshold
+        901,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         33,   // KS1 -> KS2 squaring threshold
+        901,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         43,   // KS1 -> KS2 middle product threshold
+        185,   // KS2 -> KS4 middle product threshold
+      23040,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 5
+         35,   // KS1 -> KS2 multiplication threshold
+        264,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         31,   // KS1 -> KS2 squaring threshold
+        264,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         35,   // KS1 -> KS2 middle product threshold
+        144,   // KS2 -> KS4 middle product threshold
+      21569,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 6
+         33,   // KS1 -> KS2 multiplication threshold
+        247,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         31,   // KS1 -> KS2 squaring threshold
+        173,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         33,   // KS1 -> KS2 middle product threshold
+        116,   // KS2 -> KS4 middle product threshold
+      14044,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         12    // nussbaumer squaring threshold
+   },
+   {  // bits = 7
+         33,   // KS1 -> KS2 multiplication threshold
+        247,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         23,   // KS1 -> KS2 squaring threshold
+        226,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         33,   // KS1 -> KS2 middle product threshold
+        116,   // KS2 -> KS4 middle product threshold
+      12720,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 8
+         27,   // KS1 -> KS2 multiplication threshold
+        123,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         21,   // KS1 -> KS2 squaring threshold
+        112,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         33,   // KS1 -> KS2 middle product threshold
+         94,   // KS2 -> KS4 middle product threshold
+       7753,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 9
+         24,   // KS1 -> KS2 multiplication threshold
+        206,   // KS2 -> KS4 multiplication threshold
+      62020,   // KS4 -> FFT multiplication threshold
+         21,   // KS1 -> KS2 squaring threshold
+        158,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         27,   // KS1 -> KS2 middle product threshold
+         86,   // KS2 -> KS4 middle product threshold
+       9451,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 10
+         25,   // KS1 -> KS2 multiplication threshold
+         86,   // KS2 -> KS4 multiplication threshold
+      68475,   // KS4 -> FFT multiplication threshold
+         17,   // KS1 -> KS2 squaring threshold
+         66,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         24,   // KS1 -> KS2 middle product threshold
+         70,   // KS2 -> KS4 middle product threshold
+       4726,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 11
+         23,   // KS1 -> KS2 multiplication threshold
+        134,   // KS2 -> KS4 multiplication threshold
+      62020,   // KS4 -> FFT multiplication threshold
+         21,   // KS1 -> KS2 squaring threshold
+        107,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         23,   // KS1 -> KS2 middle product threshold
+         75,   // KS2 -> KS4 middle product threshold
+       5393,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 12
+         17,   // KS1 -> KS2 multiplication threshold
+         78,   // KS2 -> KS4 multiplication threshold
+      75603,   // KS4 -> FFT multiplication threshold
+         16,   // KS1 -> KS2 squaring threshold
+         72,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         19,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+       4280,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 13
+         17,   // KS1 -> KS2 multiplication threshold
+         86,   // KS2 -> KS4 multiplication threshold
+      56173,   // KS4 -> FFT multiplication threshold
+         14,   // KS1 -> KS2 squaring threshold
+         78,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         19,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+       4726,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 14
+         19,   // KS1 -> KS2 multiplication threshold
+         66,   // KS2 -> KS4 multiplication threshold
+      62020,   // KS4 -> FFT multiplication threshold
+         16,   // KS1 -> KS2 squaring threshold
+         62,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         21,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 15
+         16,   // KS1 -> KS2 multiplication threshold
+         67,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         13,   // KS1 -> KS2 squaring threshold
+         43,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         51,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 16
+         14,   // KS1 -> KS2 multiplication threshold
+         66,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         13,   // KS1 -> KS2 squaring threshold
+         56,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         54,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 17
+         14,   // KS1 -> KS2 multiplication threshold
+         47,   // KS2 -> KS4 multiplication threshold
+      50877,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         33,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         47,   // KS2 -> KS4 middle product threshold
+       3077,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 18
+         13,   // KS1 -> KS2 multiplication threshold
+         47,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         13,   // KS1 -> KS2 squaring threshold
+         33,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         14,   // KS1 -> KS2 middle product threshold
+         47,   // KS2 -> KS4 middle product threshold
+       3077,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 19
+         13,   // KS1 -> KS2 multiplication threshold
+         31,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         27,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         38,   // KS2 -> KS4 middle product threshold
+       2363,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 20
+         12,   // KS1 -> KS2 multiplication threshold
+         43,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         33,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         39,   // KS2 -> KS4 middle product threshold
+       2363,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 21
+         13,   // KS1 -> KS2 multiplication threshold
+         32,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         25,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         35,   // KS2 -> KS4 middle product threshold
+       1756,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 22
+         12,   // KS1 -> KS2 multiplication threshold
+         31,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         24,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         35,   // KS2 -> KS4 middle product threshold
+       2071,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 23
+         12,   // KS1 -> KS2 multiplication threshold
+         24,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         21,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         27,   // KS2 -> KS4 middle product threshold
+       1539,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 24
+          9,   // KS1 -> KS2 multiplication threshold
+         28,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         23,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         29,   // KS2 -> KS4 middle product threshold
+       1756,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 25
+         12,   // KS1 -> KS2 multiplication threshold
+         21,   // KS2 -> KS4 multiplication threshold
+      25439,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         21,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         25,   // KS2 -> KS4 middle product threshold
+       1305,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 26
+         10,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+      31010,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         19,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         25,   // KS2 -> KS4 middle product threshold
+       1349,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 27
+          9,   // KS1 -> KS2 multiplication threshold
+         17,   // KS2 -> KS4 multiplication threshold
+      28087,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         23,   // KS2 -> KS4 middle product threshold
+       1070,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 28
+         10,   // KS1 -> KS2 multiplication threshold
+         17,   // KS2 -> KS4 multiplication threshold
+      28087,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         21,   // KS2 -> KS4 middle product threshold
+       1070,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 29
+          8,   // KS1 -> KS2 multiplication threshold
+         17,   // KS2 -> KS4 multiplication threshold
+      20868,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         14,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         21,   // KS2 -> KS4 middle product threshold
+        970,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          8    // nussbaumer squaring threshold
+   },
+   {  // bits = 30
+          8,   // KS1 -> KS2 multiplication threshold
+         27,   // KS2 -> KS4 multiplication threshold
+      28087,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         23,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         27,   // KS2 -> KS4 middle product threshold
+       1305,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          8    // nussbaumer squaring threshold
+   },
+   {  // bits = 31
+          9,   // KS1 -> KS2 multiplication threshold
+         23,   // KS2 -> KS4 multiplication threshold
+      28087,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         23,   // KS2 -> KS4 middle product threshold
+       1070,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 32
+          8,   // KS1 -> KS2 multiplication threshold
+         25,   // KS2 -> KS4 multiplication threshold
+      28087,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         19,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          8,   // KS1 -> KS2 middle product threshold
+         27,   // KS2 -> KS4 middle product threshold
+       1539,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+};
+
+// end of file ****************************************************************
diff --git a/srcpkgs/zn_poly/files/tuning-64.c b/srcpkgs/zn_poly/files/tuning-64.c
new file mode 100644
index 000000000000..34ac692c83b0
--- /dev/null
+++ b/srcpkgs/zn_poly/files/tuning-64.c
@@ -0,0 +1,860 @@
+/*
+   NOTE: do not edit this file! It is auto-generated by the "tune" program.
+   (Run "make tune" and then "./tune > tuning.c" to regenerate it.)
+*/
+
+/*
+   tuning.c:  global tuning values
+
+   Copyright (C) 2007, 2008, David Harvey
+
+   This file is part of the zn_poly library (version 0.9).
+
+   This program is free software: you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation, either version 2 of the License, or
+   (at your option) version 3 of the License.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+*/
+
+#include "zn_poly_internal.h"
+
+size_t ZNP_mpn_smp_kara_thresh = 35;
+size_t ZNP_mpn_mulmid_fallback_thresh = 458;
+
+tuning_info_t tuning_info[] = 
+{
+   {  // bits = 0
+   },
+   {  // bits = 1
+   },
+   {  // bits = 2
+        141,   // KS1 -> KS2 multiplication threshold
+      14733,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+        102,   // KS1 -> KS2 squaring threshold
+       3602,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+        102,   // KS1 -> KS2 middle product threshold
+       1378,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 3
+        112,   // KS1 -> KS2 multiplication threshold
+       4308,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         61,   // KS1 -> KS2 squaring threshold
+       4817,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+        102,   // KS1 -> KS2 middle product threshold
+       2356,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 4
+         80,   // KS1 -> KS2 multiplication threshold
+       2576,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         57,   // KS1 -> KS2 squaring threshold
+       1801,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         72,   // KS1 -> KS2 middle product threshold
+       1053,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 5
+         67,   // KS1 -> KS2 multiplication threshold
+       3294,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         56,   // KS1 -> KS2 squaring threshold
+       2303,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         65,   // KS1 -> KS2 middle product threshold
+        901,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 6
+         57,   // KS1 -> KS2 multiplication threshold
+       1152,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         51,   // KS1 -> KS2 squaring threshold
+        985,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         62,   // KS1 -> KS2 middle product threshold
+        985,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 7
+         47,   // KS1 -> KS2 multiplication threshold
+       1540,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         43,   // KS1 -> KS2 squaring threshold
+        788,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         51,   // KS1 -> KS2 middle product threshold
+        451,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 8
+         47,   // KS1 -> KS2 multiplication threshold
+        901,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         35,   // KS1 -> KS2 squaring threshold
+        753,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         43,   // KS1 -> KS2 middle product threshold
+        576,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 9
+         43,   // KS1 -> KS2 multiplication threshold
+        589,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         43,   // KS1 -> KS2 squaring threshold
+        493,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         39,   // KS1 -> KS2 middle product threshold
+        302,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 10
+         38,   // KS1 -> KS2 multiplication threshold
+        824,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         35,   // KS1 -> KS2 squaring threshold
+        576,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         38,   // KS1 -> KS2 middle product threshold
+        337,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 11
+         43,   // KS1 -> KS2 multiplication threshold
+        431,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         33,   // KS1 -> KS2 squaring threshold
+        377,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         33,   // KS1 -> KS2 middle product threshold
+        247,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 12
+         38,   // KS1 -> KS2 multiplication threshold
+        482,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         33,   // KS1 -> KS2 squaring threshold
+        403,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         33,   // KS1 -> KS2 middle product threshold
+        216,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 13
+         33,   // KS1 -> KS2 multiplication threshold
+        345,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         30,   // KS1 -> KS2 squaring threshold
+        315,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         33,   // KS1 -> KS2 middle product threshold
+        189,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 14
+         33,   // KS1 -> KS2 multiplication threshold
+        345,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         29,   // KS1 -> KS2 squaring threshold
+        286,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         31,   // KS1 -> KS2 middle product threshold
+        116,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 15
+         31,   // KS1 -> KS2 multiplication threshold
+        322,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         23,   // KS1 -> KS2 squaring threshold
+        226,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         29,   // KS1 -> KS2 middle product threshold
+        121,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+       1000    // nussbaumer squaring threshold
+   },
+   {  // bits = 16
+         29,   // KS1 -> KS2 multiplication threshold
+        337,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         25,   // KS1 -> KS2 squaring threshold
+        173,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         29,   // KS1 -> KS2 middle product threshold
+        101,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 17
+         29,   // KS1 -> KS2 multiplication threshold
+        231,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         23,   // KS1 -> KS2 squaring threshold
+        216,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         27,   // KS1 -> KS2 middle product threshold
+         94,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 18
+         25,   // KS1 -> KS2 multiplication threshold
+        189,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         23,   // KS1 -> KS2 squaring threshold
+        134,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         23,   // KS1 -> KS2 middle product threshold
+         94,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 19
+         27,   // KS1 -> KS2 multiplication threshold
+        226,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         21,   // KS1 -> KS2 squaring threshold
+        216,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         21,   // KS1 -> KS2 middle product threshold
+         86,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+       1000,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 20
+         22,   // KS1 -> KS2 multiplication threshold
+        144,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         21,   // KS1 -> KS2 squaring threshold
+        119,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         21,   // KS1 -> KS2 middle product threshold
+         80,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 21
+         25,   // KS1 -> KS2 multiplication threshold
+        189,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         19,   // KS1 -> KS2 squaring threshold
+        121,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         21,   // KS1 -> KS2 middle product threshold
+         80,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 22
+         21,   // KS1 -> KS2 multiplication threshold
+        102,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         21,   // KS1 -> KS2 squaring threshold
+        102,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         21,   // KS1 -> KS2 middle product threshold
+         70,   // KS2 -> KS4 middle product threshold
+   SIZE_MAX,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         13    // nussbaumer squaring threshold
+   },
+   {  // bits = 23
+         22,   // KS1 -> KS2 multiplication threshold
+        147,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         17,   // KS1 -> KS2 squaring threshold
+        102,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         19,   // KS1 -> KS2 middle product threshold
+         72,   // KS2 -> KS4 middle product threshold
+      23040,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         12    // nussbaumer squaring threshold
+   },
+   {  // bits = 24
+         19,   // KS1 -> KS2 multiplication threshold
+         86,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         19,   // KS1 -> KS2 squaring threshold
+         72,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         19,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+      20868,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         12    // nussbaumer squaring threshold
+   },
+   {  // bits = 25
+         21,   // KS1 -> KS2 multiplication threshold
+        149,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         19,   // KS1 -> KS2 squaring threshold
+        102,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         21,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+      20868,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         12    // nussbaumer squaring threshold
+   },
+   {  // bits = 26
+         17,   // KS1 -> KS2 multiplication threshold
+         86,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         16,   // KS1 -> KS2 squaring threshold
+         72,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+      16026,   // KS4 -> FFT middle product threshold
+         13,   // nussbaumer multiplication threshold
+         12    // nussbaumer squaring threshold
+   },
+   {  // bits = 27
+         17,   // KS1 -> KS2 multiplication threshold
+        119,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         16,   // KS1 -> KS2 squaring threshold
+         64,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         16,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+      16026,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 28
+         23,   // KS1 -> KS2 multiplication threshold
+        107,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         14,   // KS1 -> KS2 squaring threshold
+         62,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         66,   // KS2 -> KS4 middle product threshold
+       9451,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 29
+         17,   // KS1 -> KS2 multiplication threshold
+         85,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         14,   // KS1 -> KS2 squaring threshold
+         65,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         16,   // KS1 -> KS2 middle product threshold
+         61,   // KS2 -> KS4 middle product threshold
+      14044,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 30
+         17,   // KS1 -> KS2 multiplication threshold
+         66,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         17,   // KS1 -> KS2 squaring threshold
+         43,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         61,   // KS2 -> KS4 middle product threshold
+      12307,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 31
+         17,   // KS1 -> KS2 multiplication threshold
+         73,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         13,   // KS1 -> KS2 squaring threshold
+         35,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         57,   // KS2 -> KS4 middle product threshold
+      16026,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 32
+         17,   // KS1 -> KS2 multiplication threshold
+         56,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         14,   // KS1 -> KS2 squaring threshold
+         40,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         56,   // KS2 -> KS4 middle product threshold
+       8013,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 33
+         16,   // KS1 -> KS2 multiplication threshold
+         47,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         15,   // KS1 -> KS2 squaring threshold
+         29,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         14,   // KS1 -> KS2 middle product threshold
+         51,   // KS2 -> KS4 middle product threshold
+       8560,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 34
+         16,   // KS1 -> KS2 multiplication threshold
+         47,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         15,   // KS1 -> KS2 squaring threshold
+         27,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         17,   // KS1 -> KS2 middle product threshold
+         47,   // KS2 -> KS4 middle product threshold
+       6154,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 35
+         16,   // KS1 -> KS2 multiplication threshold
+         47,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         13,   // KS1 -> KS2 squaring threshold
+         29,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         43,   // KS2 -> KS4 middle product threshold
+       7022,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 36
+         16,   // KS1 -> KS2 multiplication threshold
+         36,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         29,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         43,   // KS2 -> KS4 middle product threshold
+       6154,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 37
+         14,   // KS1 -> KS2 multiplication threshold
+         36,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         25,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         14,   // KS1 -> KS2 middle product threshold
+         47,   // KS2 -> KS4 middle product threshold
+       7022,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 38
+         16,   // KS1 -> KS2 multiplication threshold
+         30,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         13,   // KS1 -> KS2 squaring threshold
+         25,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         14,   // KS1 -> KS2 middle product threshold
+         43,   // KS2 -> KS4 middle product threshold
+       4726,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 39
+         15,   // KS1 -> KS2 multiplication threshold
+         32,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         23,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         40,   // KS2 -> KS4 middle product threshold
+       5393,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         11    // nussbaumer squaring threshold
+   },
+   {  // bits = 40
+         13,   // KS1 -> KS2 multiplication threshold
+         32,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         23,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         38,   // KS2 -> KS4 middle product threshold
+       4726,   // KS4 -> FFT middle product threshold
+         12,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 41
+         13,   // KS1 -> KS2 multiplication threshold
+         25,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         21,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         33,   // KS2 -> KS4 middle product threshold
+       4726,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 42
+         13,   // KS1 -> KS2 multiplication threshold
+         27,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         23,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         33,   // KS2 -> KS4 middle product threshold
+       4280,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 43
+         13,   // KS1 -> KS2 multiplication threshold
+         24,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         21,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         31,   // KS2 -> KS4 middle product threshold
+       4726,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 44
+         13,   // KS1 -> KS2 multiplication threshold
+         33,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         18,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         33,   // KS2 -> KS4 middle product threshold
+       4280,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 45
+         12,   // KS1 -> KS2 multiplication threshold
+         23,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         19,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         29,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 46
+         15,   // KS1 -> KS2 multiplication threshold
+         25,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         13,   // KS1 -> KS2 middle product threshold
+         31,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 47
+         12,   // KS1 -> KS2 multiplication threshold
+         23,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         12,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         29,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 48
+         13,   // KS1 -> KS2 multiplication threshold
+         23,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         25,   // KS2 -> KS4 middle product threshold
+       3877,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 49
+         12,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         16,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         25,   // KS2 -> KS4 middle product threshold
+       3511,   // KS4 -> FFT middle product threshold
+         11,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 50
+         10,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         17,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         27,   // KS2 -> KS4 middle product threshold
+       3511,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 51
+         10,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         16,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         23,   // KS2 -> KS4 middle product threshold
+       2363,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 52
+         12,   // KS1 -> KS2 multiplication threshold
+         21,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+         10,   // KS1 -> KS2 squaring threshold
+         16,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         24,   // KS2 -> KS4 middle product threshold
+       2697,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 53
+         12,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         16,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         24,   // KS2 -> KS4 middle product threshold
+       2140,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+         10    // nussbaumer squaring threshold
+   },
+   {  // bits = 54
+         12,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         16,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         23,   // KS2 -> KS4 middle product threshold
+       2363,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 55
+          9,   // KS1 -> KS2 multiplication threshold
+         16,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         12,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         10,   // KS1 -> KS2 middle product threshold
+         19,   // KS2 -> KS4 middle product threshold
+       1939,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 56
+          9,   // KS1 -> KS2 multiplication threshold
+         16,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          8,   // KS1 -> KS2 squaring threshold
+         13,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         19,   // KS2 -> KS4 middle product threshold
+       2140,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 57
+         10,   // KS1 -> KS2 multiplication threshold
+         16,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         15,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         21,   // KS2 -> KS4 middle product threshold
+       2071,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 58
+         12,   // KS1 -> KS2 multiplication threshold
+         16,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          9,   // KS1 -> KS2 squaring threshold
+         12,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+         12,   // KS1 -> KS2 middle product threshold
+         25,   // KS2 -> KS4 middle product threshold
+       2071,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 59
+          8,   // KS1 -> KS2 multiplication threshold
+         16,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         13,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          8,   // KS1 -> KS2 middle product threshold
+         19,   // KS2 -> KS4 middle product threshold
+       1756,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 60
+          8,   // KS1 -> KS2 multiplication threshold
+         16,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         13,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          8,   // KS1 -> KS2 middle product threshold
+         17,   // KS2 -> KS4 middle product threshold
+       1756,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 61
+          8,   // KS1 -> KS2 multiplication threshold
+         12,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         12,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          8,   // KS1 -> KS2 middle product threshold
+         19,   // KS2 -> KS4 middle product threshold
+       1939,   // KS4 -> FFT middle product threshold
+          9,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 62
+         10,   // KS1 -> KS2 multiplication threshold
+         19,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         14,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         23,   // KS2 -> KS4 middle product threshold
+       2697,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 63
+          9,   // KS1 -> KS2 multiplication threshold
+         25,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         19,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         23,   // KS2 -> KS4 middle product threshold
+       2363,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+   {  // bits = 64
+          9,   // KS1 -> KS2 multiplication threshold
+         31,   // KS2 -> KS4 multiplication threshold
+   SIZE_MAX,   // KS4 -> FFT multiplication threshold
+          7,   // KS1 -> KS2 squaring threshold
+         25,   // KS2 -> KS4 squaring threshold
+   SIZE_MAX,   // KS4 -> FFT squaring threshold
+          9,   // KS1 -> KS2 middle product threshold
+         25,   // KS2 -> KS4 middle product threshold
+       2697,   // KS4 -> FFT middle product threshold
+         10,   // nussbaumer multiplication threshold
+          9    // nussbaumer squaring threshold
+   },
+};
+
+// end of file ****************************************************************
diff --git a/srcpkgs/zn_poly/template b/srcpkgs/zn_poly/template
new file mode 100644
index 000000000000..ca6d0c764f16
--- /dev/null
+++ b/srcpkgs/zn_poly/template
@@ -0,0 +1,32 @@
+# Template file for 'zn_poly'
+pkgname=zn_poly
+version=0.9.2
+revision=1
+build_style=configure
+configure_args="--prefix=\$(DESTDIR)/usr"
+hostmakedepends="python3"
+makedepends="gmp-devel"
+short_desc="Library for polynomial arithmetic in Z/nZ[x], for unsigned long n"
+maintainer="Gonzalo Tornaría <tornaria@cmat.edu.uy>"
+license="GPL-2.0-only, GPL-3.0-only"
+homepage="https://gitlab.com/sagemath/zn_poly"
+distfiles="https://gitlab.com/sagemath/zn_poly/-/archive/${version}/zn_poly-${version}.tar.bz2"
+checksum=29d88ce19939f53e920adf118d8cd6c8c9594bc8cb71a992a6137bd86f6fb7f5
+
+CFLAGS=-fPIC
+
+build_options="native_build"
+
+if [ -z "$build_option_native_build" ]; then
+	configure_args+=" --disable-tuning"
+fi
+
+post_extract() {
+	cp -v ${FILESDIR}/tuning-${XBPS_WORDSIZE}.c tune/tuning.c
+}
+
+do_configure() {
+	./configure ${configure_args} \
+		--cflags="$CFLAGS" --ldflags="$LDFLAGS" \
+		--cppflags="$CPPFLAGS" --cxxflags="$CXXFLAGS"
+}


* Re: New package: zn_poly-0.9.2
  2021-11-09  1:24 [PR PATCH] New package: zn_poly-0.9.2 tornaria
@ 2021-11-09  1:28 ` tornaria
  2021-11-09 12:19 ` dkwo
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: tornaria @ 2021-11-09  1:28 UTC (permalink / raw)
  To: ml

[-- Attachment #1: Type: text/plain, Size: 182 bytes --]

New comment by tornaria on void-packages repository

https://github.com/void-linux/void-packages/pull/33969#issuecomment-963729836

Comment:
@dkwo @leahneukirchen this works for me.


* Re: New package: zn_poly-0.9.2
  2021-11-09  1:24 [PR PATCH] New package: zn_poly-0.9.2 tornaria
  2021-11-09  1:28 ` tornaria
@ 2021-11-09 12:19 ` dkwo
  2021-11-09 12:42 ` tornaria
  2021-11-09 23:10 ` [PR PATCH] [Merged]: " leahneukirchen
  3 siblings, 0 replies; 5+ messages in thread
From: dkwo @ 2021-11-09 12:19 UTC (permalink / raw)
  To: ml

[-- Attachment #1: Type: text/plain, Size: 198 bytes --]

New comment by dkwo on void-packages repository

https://github.com/void-linux/void-packages/pull/33969#issuecomment-964100644

Comment:
On native, is the purpose of those files to save build time?


* Re: New package: zn_poly-0.9.2
  2021-11-09  1:24 [PR PATCH] New package: zn_poly-0.9.2 tornaria
  2021-11-09  1:28 ` tornaria
  2021-11-09 12:19 ` dkwo
@ 2021-11-09 12:42 ` tornaria
  2021-11-09 23:10 ` [PR PATCH] [Merged]: " leahneukirchen
  3 siblings, 0 replies; 5+ messages in thread
From: tornaria @ 2021-11-09 12:42 UTC (permalink / raw)
  To: ml

[-- Attachment #1: Type: text/plain, Size: 538 bytes --]

New comment by tornaria on void-packages repository

https://github.com/void-linux/void-packages/pull/33969#issuecomment-964116800

Comment:
> On native, is the purpose of those files to save build time?

No, tuning doesn't take long. It just seems easier and more stable not to run tuning on the builders at all. Note that tuning might be sensitive to CPU load, so it's preferable to run it on an otherwise idle box, which builders may not always be.

If a tuned build is needed, one can use `-o native_build` which will enable tuning.
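
As a rough sketch of how that option interacts with the template: the function below mirrors the `build_option_native_build` check from the template in plain shell. The function name `zn_poly_configure_args` is hypothetical and only for illustration; it is not part of xbps-src, and it assumes the option variable is simply non-empty when the option is enabled.

```shell
# Hypothetical helper (not part of xbps-src) mirroring the template's check.
# $1 stands in for build_option_native_build: empty = option off,
# non-empty = option on (assumption for this sketch).
zn_poly_configure_args() {
	build_option_native_build="$1"
	# Same base argument as the template; \$ keeps $(DESTDIR) literal.
	configure_args="--prefix=\$(DESTDIR)/usr"
	if [ -z "$build_option_native_build" ]; then
		# Default case: skip tuning and use the shipped tuning-*.c file.
		configure_args="$configure_args --disable-tuning"
	fi
	printf '%s\n' "$configure_args"
}

zn_poly_configure_args ""   # default build: tuning disabled
zn_poly_configure_args 1    # native_build enabled: tuning left on
```

With the real tooling, the option would presumably be passed as something like `./xbps-src -o native_build pkg zn_poly`, run on an otherwise idle machine.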


* Re: [PR PATCH] [Merged]: New package: zn_poly-0.9.2
  2021-11-09  1:24 [PR PATCH] New package: zn_poly-0.9.2 tornaria
                   ` (2 preceding siblings ...)
  2021-11-09 12:42 ` tornaria
@ 2021-11-09 23:10 ` leahneukirchen
  3 siblings, 0 replies; 5+ messages in thread
From: leahneukirchen @ 2021-11-09 23:10 UTC (permalink / raw)
  To: ml

[-- Attachment #1: Type: text/plain, Size: 492 bytes --]

There's a merged pull request on the void-packages repository

New package: zn_poly-0.9.2
https://github.com/void-linux/void-packages/pull/33969

Description:
Dependency for sage. I compiled sage-9.4 using this.

I ran tuning on my box, once in 64-bit mode and once in 32-bit mode, and hardcoded the resulting tuning files. This won't be optimal for all CPUs, particularly non-x86 ones, but it seems a fair solution and avoids running tuning on every build. Tuning won't work when cross-compiling anyway.
