2023-07-24 12:58 GMT+03:00, kemal <kemalinanc8@gmail.com>:
> 2023-07-24 5:35 GMT+03:00, ori@eigenstate.org <ori@eigenstate.org>:
>> Quoth kemal <kemalinanc8@gmail.com>:
>>>
>>> my mail didn't end up in /n/lists/9front, so the mail server may have
>>> eaten my mail. i'm going to reply to my original message and upload
>>> my patch to okturing instead of attaching it, hoping that it will get
>>> sent
>>> http://okturing.com/src/16441/body
>>
>> this breaks wireless on my t460s (Intel Wireless 8260, 8086:24f3).
>>
>> Before, it would often give me firmware crashes, but with this it
>> does it on every bind:
>>
>> 	 #l1: fatal firmware error
>> 	 lastcmd: 108 (0x6c)
>> 	 error:  id 2b12, trm_hw_status 000002f0 00000000,
>> 	         branchlink2 0002395c, interruptlink 0003867a 00000000,
>> 	         errordata 80087cca 7acdf1f6 00000000
>> 	 #l1: flushq: broken
>>
>> happy to help debug, but not sure where to start.
>>
>>
>
> first of all, thank you for testing!
>
> as for what happens here, it errors on the command i changed. firmware
> makes the situation worse by sending us an unknown error id (0x2b12),
> but i at least know that it's not because we're sending a bad command but
> because firmware breaks for a cryptic reason. just for reference,
> i'll leave the error id table from openbsd here:
>
> static struct {
> 	const char *name;
> 	uint8_t num;
> } advanced_lookup[] = {
> 	{ "NMI_INTERRUPT_WDG", 0x34 },
> 	{ "SYSASSERT", 0x35 },
> 	{ "UCODE_VERSION_MISMATCH", 0x37 },
> 	{ "BAD_COMMAND", 0x38 },
> 	{ "BAD_COMMAND", 0x39 },
> 	{ "NMI_INTERRUPT_DATA_ACTION_PT", 0x3C },
> 	{ "FATAL_ERROR", 0x3D },
> 	{ "NMI_TRM_HW_ERR", 0x46 },
> 	{ "NMI_INTERRUPT_TRM", 0x4C },
> 	{ "NMI_INTERRUPT_BREAK_POINT", 0x54 },
> 	{ "NMI_INTERRUPT_WDG_RXF_FULL", 0x5C },
> 	{ "NMI_INTERRUPT_WDG_NO_RBD_RXF_FULL", 0x64 },
> 	{ "NMI_INTERRUPT_HOST", 0x66 },
> 	{ "NMI_INTERRUPT_LMAC_FATAL", 0x70 },
> 	{ "NMI_INTERRUPT_UMAC_FATAL", 0x71 },
> 	{ "NMI_INTERRUPT_OTHER_LMAC_FATAL", 0x73 },
> 	{ "NMI_INTERRUPT_ACTION_PT", 0x7C },
> 	{ "NMI_INTERRUPT_UNKNOWN", 0x84 },
> 	{ "NMI_INTERRUPT_INST_ACTION_PT", 0x86 },
> 	{ "ADVANCED_SYSASSERT", 0 },
> };
> (unknown errors are ADVANCED_SYSASSERT)
>
> sadly, the rest of the error is useless as no one except intel knows the
> firmware's internals. i want to know why it fails, so i added some debug
> prints to deduce something.
> diff with debug prints attached.
>
> for reference, this is what openbsd does:
>
> int
> iwm_send_phy_db_cmd(struct iwm_softc *sc, uint16_t type, uint16_t length,
>     void *data)
> {
> 	struct iwm_phy_db_cmd phy_db_cmd;
> 	struct iwm_host_cmd cmd = {
> 		.id = IWM_PHY_DB_CMD,
> 		.flags = IWM_CMD_ASYNC,
> 	};
>
> 	phy_db_cmd.type = le16toh(type);
> 	phy_db_cmd.length = le16toh(length);
>
> 	cmd.data[0] = &phy_db_cmd;
> 	cmd.len[0] = sizeof(struct iwm_phy_db_cmd);
> 	cmd.data[1] = data;
> 	cmd.len[1] = length;
>
> 	return iwm_send_cmd(sc, &cmd);
> }
>
> and this is the structure:
>
> struct iwm_phy_db_cmd {
> 	uint16_t type;
> 	uint16_t length;
> 	uint8_t data[];
> } __packed;
>
> (data is unused)
>

sorry. after staring at the code for a bit, i realised that
our code kept the type+length part of the calib block sent
by the firmware. for some reason, openbsd+linux separate
that part and send it in the command buffer.

i aligned the behavior, but this shouldn't change anything, and
this is probably not related at all to the speed issue we're
experiencing
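
to make the split concrete, here is a minimal sketch of the idea, not
the code from the diff. it assumes the calib block starts with a
little-endian 16-bit type and 16-bit length followed by the payload
(the same layout as struct iwm_phy_db_cmd above). CalibSplit, get16
and splitcalib are made-up names for illustration only:

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical names; assumed layout: le16 type, le16 length, payload */
enum { CalibHdrLen = 4 };

typedef struct CalibSplit CalibSplit;
struct CalibSplit {
	uint8_t hdr[CalibHdrLen];	/* type+length, kept little-endian */
	const uint8_t *data;		/* payload, points into the original block */
	uint16_t len;			/* payload length taken from the header */
};

/* read a little-endian 16-bit value regardless of host byte order */
static uint16_t
get16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

/*
 * split a calib block into its type+length header and its payload.
 * returns 0 on success, -1 if the block is shorter than it claims.
 */
static int
splitcalib(const uint8_t *blk, size_t n, CalibSplit *s)
{
	size_t i;

	if(n < CalibHdrLen)
		return -1;
	for(i = 0; i < CalibHdrLen; i++)
		s->hdr[i] = blk[i];
	s->len = get16(blk + 2);
	if(s->len > n - CalibHdrLen)
		return -1;
	s->data = blk + CalibHdrLen;
	return 0;
}
```

with this, hdr would go into the command buffer as the first
fragment and data/len would become the second fragment, which
mirrors what cmd.data[0] and cmd.data[1] do in the openbsd code.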

could you send a snippet of the firmware crashes you experience?

diff attached