Skip to content

Fix advertising not resuming in extended mode#51309

Closed
tgagneret-embedded wants to merge 1 commit intozephyrproject-rtos:mainfrom
tgagneret-embedded:fix/persistent_advertising
Closed

Fix advertising not resuming in extended mode#51309
tgagneret-embedded wants to merge 1 commit intozephyrproject-rtos:mainfrom
tgagneret-embedded:fix/persistent_advertising

Conversation

@tgagneret-embedded
Copy link
Contributor

Flag CONFIG_BT_EXT_ADV is only relevant and used in legacy mode. It should not be taken into account when using extended advertising.

Flag CONFIG_BT_EXT_ADV is only relevant and used in legacy mode. It should
not be taken into account when using extended advertising.

Signed-off-by: Thomas Gagneret <tgagneret@witekio.com>
@tgagneret-embedded
Copy link
Contributor Author

Hi,

My problem comes from the fact that when I advertise and one device connects, even if CONFIG_BT_MAX_CONN is set to 10, advertising is not resumed automatically.

After some verification it seems that the problem is elsewhere. It is possible to enable CONFIG_BT_EXT_ADV but start legacy advertising. But doing this seems to break the resume because for some reason BT_ADV_ENABLED is cleared only if CONFIG_BT_EXT_ADV is not enabled. So I think that verification does not use the correct information.

If I disabled CONFIG_BT_EXT_ADV from my prj.conf, then I have the expected behavior.

So there is a problem I think, but this PR does not solve anything.

Thanks

@carlescufi
Copy link
Member

@tgagneret-embedded could this be the fix you are looking for? #51256

@tgagneret-embedded
Copy link
Contributor Author

tgagneret-embedded commented Oct 15, 2022

Hi,

I don't think so. If you want to reproduce my issue:

  • Compile with CONFIG_BT_EXT_ADV enabled and CONFIG_BT_MAX_CONN set to 5
  • In your main, start LEGACY advertising with BT_LE_ADV_OPT_ONE_TIME bit not set (bt_le_adv_start)

Expected behavior:
When a device connects you should advertise again since CONFIG_BT_MAX_CONN is > 1.

Actual behavior:
Advertising does not resume.

Analysis:
hci_core.c calls for bt_le_adv_resume but in this function it checks BT_ADV_ENABLED and if this bit is set it does not resume advertisin. However this bits seems to be cleared only if CONFIG_BT_EXT_ADV is NOT enabled during compilation. Not sure why.

My fix clear this issue but I'm not sure it's the root problem.

@carlescufi
Copy link
Member

@tgagneret-embedded could you please open a GitHub issue with the info you provided?

Copy link
Contributor

@ppryga-nordic ppryga-nordic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the problem should be fixed in right place, which is in my humble opinion here:

if (IS_ENABLED(CONFIG_BT_PERIPHERAL) &&
evt->role == BT_HCI_ROLE_PERIPHERAL &&
!(IS_ENABLED(CONFIG_BT_EXT_ADV) &&
BT_DEV_FEAT_LE_EXT_ADV(bt_dev.le.features))) {
struct bt_le_ext_adv *adv = bt_le_adv_lookup_legacy();
/* Clear advertising even if we are not able to add connection
* object to keep host in sync with controller state.
*/
atomic_clear_bit(adv->flags, BT_ADV_ENABLED);
(void)bt_le_lim_adv_cancel_timeout(adv);
}
.

In case CONFIG_BT_EXT_ADV is enabled and a Controller's extended advertising feature bit is set the BT_ADV_ENABLED bit is not cleared.

BTW. The commit message does not align witch changes done in the commit. It is hard to connect reason of the change with the change itself.

@Thalley
Copy link
Contributor

Thalley commented Oct 17, 2022

I am not sure I follow this change at all. The commit message doesn't make sense to me:

Flag CONFIG_BT_EXT_ADV is only relevant and used in legacy mode. It should not be taken into account when using extended advertising.

CONFIG_BT_EXT_ADV is Kconfig option to enable the extended advertising support, and should absolutely be taken into account when using extended advertising.

The PR title suggests that this fix will enable persisting advertising for extended advertisers, which, based on the changes, it doesn't seem to do.

Extended advertising is currently, by design, not persisting, see e.g.

	/**
	 * @brief Advertise one time.
	 *
	 * Don't try to resume connectable advertising after a connection.
	 * This option is only meaningful when used together with
	 * BT_LE_ADV_OPT_CONNECTABLE. If set the advertising will be stopped
	 * when bt_le_adv_stop() is called or when an incoming (peripheral)
	 * connection happens. If this option is not set the stack will
	 * take care of keeping advertising enabled even as connections
	 * occur.
	 * If Advertising directed or the advertiser was started with
	 * @ref bt_le_ext_adv_start then this behavior is the default behavior
	 * and this flag has no effect.
	 */
	BT_LE_ADV_OPT_ONE_TIME = BIT(1),

The properly support persistent extended advertising, I created #30699 a while ago, and this PR seems to miss several of the challenges of implementing this.

@tgagneret-embedded
Copy link
Contributor Author

Hi,

You're right @Thalley , my message is not correct at all. The goal is to resume a LEGACY advertising (started by bt_le_adv_start) when EXTENDED is enabled during compilation.

Maybe my fix is not the correct one. The problem is probably before this and @ppryga-nordic points to the right problem I think:
CONFIG_BT_EXT_ADV should not be used to know if you are advertising using extended mode, you should have a context depending on "bt_le_adv_start" or "bt_le_ext_adv_start" and used it to clear or not the BT_ADV_ENABLED flag.

@Thalley
Copy link
Contributor

Thalley commented Oct 17, 2022

Hi,

You're right @Thalley , my message is not correct at all. The goal is to resume a LEGACY advertising (started by bt_le_adv_start) when EXTENDED is enabled during compilation.

Maybe my fix is not the correct one. The problem is probably before this and @ppryga-nordic points to the right problem I think: CONFIG_BT_EXT_ADV should not be used to know if you are advertising using extended mode, you should have a context depending on "bt_le_adv_start" or "bt_le_ext_adv_start" and used it to clear or not the BT_ADV_ENABLED flag.

If by extended mode, you mean that the advertising set is not setting the legacy flag (BT_HCI_LE_ADV_PROP_LEGACY) in the advertising properties then you are partly correct. For that we should check the BT_LE_ADV_OPT_EXT_ADV flag instead.

image

Keep in mind that bt_le_ext_adv_start does NOT always start an extended advertising set, and can be used for a legacy advertising set as well.

Basically CONFIG_BT_EXT_ADV determines whether the host sends the extended advertising HCI commands to the controller, or the legacy HCI commands, and does not determine if the advertising is a legacy advertiser, or an extended advertiser.

bt_le_adv_start always creates a legacy advertising set (either using the legacy HCI commands, or the extended HCI commands).
bt_le_ext_adv_start may start either a legacy or an extended advertising set, always using the extended HCI commands.

@tgagneret-embedded
Copy link
Contributor Author

To clarify and make sure we understand because I'm not sure to understand the solution here.

First forget about my commit it's the wrong one.

Second, we establish that the problem comes from these 2 samples:

if (IS_ENABLED(CONFIG_BT_PERIPHERAL) &&
evt->role == BT_HCI_ROLE_PERIPHERAL &&
!(IS_ENABLED(CONFIG_BT_EXT_ADV) &&
BT_DEV_FEAT_LE_EXT_ADV(bt_dev.le.features))) {
struct bt_le_ext_adv *adv = bt_le_adv_lookup_legacy();
/* Clear advertising even if we are not able to add connection
* object to keep host in sync with controller state.
*/
atomic_clear_bit(adv->flags, BT_ADV_ENABLED);
(void)bt_le_lim_adv_cancel_timeout(adv);
}

if (!(atomic_test_bit(adv->flags, BT_ADV_PERSIST) &&
!atomic_test_bit(adv->flags, BT_ADV_ENABLED))) {
return;
}

So even ifCONFIG_BT_EXT_ADV is enabled I can call bt_le_adv_start (legacy) , so the flag BT_ADV_ENABLED should not be cleared depending on CONFIG_BT_EXT_ADV but what kind of advertising I called at runtime.

So instead of testing CONFIG_BT_EXT_ADV is enabled we should test !atomic_test_bit(adv->flags, BT_ADV_EXT_ADV).

Doing this means means calling struct bt_le_ext_adv *adv = bt_le_adv_lookup_legacy(); in all configuration CONFIG_BT_EXT_ADV enable or not.
Since bt_le_adv_lookup_legacy returns the correct pointer depending on CONFIG_BT_EXT_ADV value, I guess it's not a problem.

struct bt_le_ext_adv *bt_le_adv_lookup_legacy(void)
{
#if defined(CONFIG_BT_EXT_ADV)
return bt_dev.adv;
#else
return &bt_dev.adv;
#endif
}

Am I correct ? If so, let me know I can update the PR if you want.

Thanks

@Thalley
Copy link
Contributor

Thalley commented Oct 18, 2022

To clarify and make sure we understand because I'm not sure to understand the solution here.

First forget about my commit it's the wrong one.

Second, we establish that the problem comes from these 2 samples:

if (IS_ENABLED(CONFIG_BT_PERIPHERAL) &&
evt->role == BT_HCI_ROLE_PERIPHERAL &&
!(IS_ENABLED(CONFIG_BT_EXT_ADV) &&
BT_DEV_FEAT_LE_EXT_ADV(bt_dev.le.features))) {
struct bt_le_ext_adv *adv = bt_le_adv_lookup_legacy();
/* Clear advertising even if we are not able to add connection
* object to keep host in sync with controller state.
*/
atomic_clear_bit(adv->flags, BT_ADV_ENABLED);
(void)bt_le_lim_adv_cancel_timeout(adv);
}

if (!(atomic_test_bit(adv->flags, BT_ADV_PERSIST) &&
!atomic_test_bit(adv->flags, BT_ADV_ENABLED))) {
return;
}

So even ifCONFIG_BT_EXT_ADV is enabled I can call bt_le_adv_start (legacy) , so the flag BT_ADV_ENABLED should not be cleared depending on CONFIG_BT_EXT_ADV but what kind of advertising I called at runtime.

So instead of testing CONFIG_BT_EXT_ADV is enabled we should test !atomic_test_bit(adv->flags, BT_ADV_EXT_ADV).

No, because CONFIG_BT_EXT_ADV indicates which HCI commands were used. If CONFIG_BT_EXT_ADV=y, then the extended HCI commands are used, and in which case we use the bt_hci_le_adv_set_terminated to clear the advertising state (regardless of whether it was started with bt_le_adv_start or bt_le_ext_adv_start. So the clearing of the advertising state in bt_hci_le_enh_conn_complete should only be done if CONFIG_BT_EXT_ADV=y, because when using the legacy commands, there is no separate advertising terminated event.

Doing this means means calling struct bt_le_ext_adv *adv = bt_le_adv_lookup_legacy(); in all configuration CONFIG_BT_EXT_ADV enable or not. Since bt_le_adv_lookup_legacy returns the correct pointer depending on CONFIG_BT_EXT_ADV value, I guess it's not a problem.

struct bt_le_ext_adv *bt_le_adv_lookup_legacy(void)
{
#if defined(CONFIG_BT_EXT_ADV)
return bt_dev.adv;
#else
return &bt_dev.adv;
#endif
}

Not sure I fully followed this part, but yes bt_le_adv_lookup_legacy can always be called to get the advertising set, if any, using legacy advertising (i.e. the one and only advertising set if CONFIG_BT_EXT_ADV=n, or the one and only extended advertising set with BT_ADV_EXT_ADV not set).

@tgagneret-embedded
Copy link
Contributor Author

Hi,

So maybe the problem is not from Zephyr but the firmware in the CPU NET (on nrf5340).
I use the firmware from Nordic : https://github.com/nrfconnect/sdk-nrf/tree/v2.1.0/applications/nrf5340_audio/bin

With our conversation I understand that when CONFIG_BT_EXT_ADV is enabled we use an other set of HCI commands and the CPU NET should resume automatically advertising if properly configured.

So if I think the problem might come the CPU NET firmware. Do you agree ?

Thanks

@Thalley
Copy link
Contributor

Thalley commented Oct 18, 2022

So maybe the problem is not from Zephyr but the firmware in the CPU NET (on nrf5340). I use the firmware from Nordic : https://github.com/nrfconnect/sdk-nrf/tree/v2.1.0/applications/nrf5340_audio/bin

With our conversation I understand that when CONFIG_BT_EXT_ADV is enabled we use an other set of HCI commands and the CPU NET should resume automatically advertising if properly configured.

Advertising is never resumed automatically by the controller (or the CPU NET as you use here) - That is a feature that has been implemented for legacy advertising in the Zephyr host

@tgagneret-embedded
Copy link
Contributor Author

Hi,

Just to make sure I understand correctly:
if CONFIG_BT_EXT_ADV enabled in my prj.conf andI start a legacy advertising using bt_le_adv_start, it should resume immediately after a connection is completed (giving CONFIG_BT_MAX_CONN > 1). Do you agree ?

My program does not resume advertising in these condition. So if you agree, can you point me to the right direction on how to solve my issue ?

Thanks

@Thalley
Copy link
Contributor

Thalley commented Oct 18, 2022

Hi,

Just to make sure I understand correctly: if CONFIG_BT_EXT_ADV enabled in my prj.conf andI start a legacy advertising using bt_le_adv_start, it should resume immediately after a connection is completed (giving CONFIG_BT_MAX_CONN > 1). Do you agree ?

My program does not resume advertising in these condition. So if you agree, can you point me to the right direction on how to solve my issue ?

Thanks

Assuming that BT_LE_ADV_OPT_ONE_TIME is not set, then yes.

Are you running your own application? Can you replicate the issue using the https://github.com/zephyrproject-rtos/zephyr/tree/main/samples/bluetooth/peripheral sample, which should work that way.

If I run that sample with CONFIG_BT_MAX_CONN=2 it works as intended using the main branch.

@tgagneret-embedded
Copy link
Contributor Author

tgagneret-embedded commented Oct 18, 2022

Hi,

If I run the bluetooth/peripheral sample (from v3.2.0 tag and using nrf5340dk_nrf5340 board) with the following patch, advertising never starts:

diff --git a/samples/bluetooth/peripheral/prj.conf b/samples/bluetooth/peripheral/prj.conf
index 549a30da79..bc5a4fb0b4 100644
--- a/samples/bluetooth/peripheral/prj.conf
+++ b/samples/bluetooth/peripheral/prj.conf
@@ -24,3 +24,6 @@ CONFIG_FLASH_PAGE_LAYOUT=y
 CONFIG_FLASH_MAP=y
 CONFIG_NVS=y
 CONFIG_SETTINGS=y
+
+CONFIG_BT_MAX_CONN=10
+CONFIG_BT_EXT_ADV=y

However if CONFIG_BT_EXT_ADV is not set in the prj.conf, advertising resume as expected.

FYI, I'm using NRF Connect SDK, so my issue is probably not be related to mainline Zephyr. I will open a ticket at Nordic for this issue.

@tgagneret-embedded
Copy link
Contributor Author

tgagneret-embedded commented Oct 18, 2022

Hi,

Sorry, I forgot to add CONFIG_BT_EXT_ADV in CPU NET prj.conf.

So "advertising never starts" is not true. Instead I can reproduce the initial issue, advertising never resume after a connection.

Sorry for that.

@Thalley
Copy link
Contributor

Thalley commented Oct 19, 2022

Hi,

Sorry, I forgot to add CONFIG_BT_EXT_ADV in CPU NET prj.conf.

So "advertising never starts" is not true. Instead I can reproduce the initial issue, advertising never resume after a connection.

Sorry for that.

Thanks for testing. I had forgotten to CONFIG_BT_EXT_ADV when testing myself, and I see the same, so clearly there is an issue here. I assume this is the issue you wanted to fix with your PR?

I don't have the solution to the issue on hand, but if you can verify that your PR fixes this, then that's great.

Alternatively, if you don't

@tgagneret-embedded
Copy link
Contributor Author

Hi,

Yes this is the issue I want to fix even if I'm not sure the PR fixes it correctly.

Thanks.

@Thalley
Copy link
Contributor

Thalley commented Oct 19, 2022

Hi,

Yes this is the issue I want to fix even if I'm not sure the PR fixes it correctly.

Thanks.

If you run the sample with your PR, does it correct resume the advertising?

But yeah, it looks like Zephyr has a problem with handling this, as the description for BT_LE_ADV_OPT_ONE_TIME does not match how bt_le_adv_start works.

Furthermore, the concept of actual act of resuming the advertising after connection isn't something that I can find documentation for in bluetooth.h either.

I would need to spend more time on this before I can figure out what needs to be done, than I have time for right now. Perhaps @alwa-nordic can help here?

@ppryga-nordic
Copy link
Contributor

Hi, I've pointed where is the actual source of the issue in my comment: #51309 (review). Following on information there. If atomic_clear_bit(adv->flags, BT_ADV_ENABLED); not executed, then in a call to bt_le_adv_resume this if is true

if (!(atomic_test_bit(adv->flags, BT_ADV_PERSIST) &&
!atomic_test_bit(adv->flags, BT_ADV_ENABLED))) {
return;
}
and resume if not done. The function returns.

This is caused by check

!(IS_ENABLED(CONFIG_BT_EXT_ADV) &&
.

The advertising resume after disconnect will not work when CONFIG_BT_EXT_ADV is enabled.

The change required update of conditions in bt_hci_le_enh_conn_complete, but...
What if there is a pending extended advertising enabled with other advertising set?

What more, it looks to me that bt_le_adv_start can set extended advertising just by use of BT_LE_ADV_OPT_EXT_ADV in advertising parameters options field. In my humble opinion this bit should be explicitly cleaned in BT_LE_ADV_OPT_EXT_ADV before

err = bt_le_adv_start_ext(adv, param, ad, ad_len, sd, sd_len);
.

BTW. I've just found that there are bt_le_adv_start_ext and bt_le_ext_adv_start, both are globals. The first one is used only internally in adv.c... that is confusing. Both of them call bt_le_adv_set_enable_ext that is responsible for execution of HCI command LE Set extended advertising enable

@Thalley
Copy link
Contributor

Thalley commented Oct 20, 2022

Hi, I've pointed where is the actual source of the issue in my comment: #51309 (review). Following on information there. If atomic_clear_bit(adv->flags, BT_ADV_ENABLED); not executed, then in a call to bt_le_adv_resume this if is true

if (!(atomic_test_bit(adv->flags, BT_ADV_PERSIST) &&
!atomic_test_bit(adv->flags, BT_ADV_ENABLED))) {
return;
}

and resume if not done. The function returns.

This is caused by check

!(IS_ENABLED(CONFIG_BT_EXT_ADV) &&

.

The advertising resume after disconnect will not work when CONFIG_BT_EXT_ADV is enabled.

The change required update of conditions in bt_hci_le_enh_conn_complete, but... What if there is a pending extended advertising enabled with other advertising set?

What more, it looks to me that bt_le_adv_start can set extended advertising just by use of BT_LE_ADV_OPT_EXT_ADV in advertising parameters options field. In my humble opinion this bit should be explicitly cleaned in BT_LE_ADV_OPT_EXT_ADV before

err = bt_le_adv_start_ext(adv, param, ad, ad_len, sd, sd_len);

.

BTW. I've just found that there are bt_le_adv_start_ext and bt_le_ext_adv_start, both are globals. The first one is used only internally in adv.c... that is confusing. Both of them call bt_le_adv_set_enable_ext that is responsible for execution of HCI command LE Set extended advertising enable

I find it all confusing too.

Both bt_le_ext_adv_start and bt_le_adv_start may in theory advertise either legacy or extended if BT_LE_ADV_OPT_EXT_ADV is set. However, bt_le_adv_start will always use the advertising set returned by adv_get_legacy. Perhaps we should start by rejecting calls to bt_le_adv_start with BT_LE_ADV_OPT_EXT_ADV?

Another issue is that advertising is, with CONFIG_BT_EXT_ADV=n, always restarting unless specified not to, but that is never documented AFAIK.

Yet another issue is that BT_LE_ADV_OPT_ONE_TIME is documented as the default behavior when using bt_le_ext_adv_start, but it clearly is the default behavior when CONFIG_BT_EXT_ADV=n as it is.

Maybe these issues need to be tackled at the same time, or maybe they should be tackled individually.

@ppryga-nordic
Copy link
Contributor

I have no strong opinion whether approach this as single issue or in multiple steps.

BTW. There is another issue related to advertising resume, it could be addressed at the same time: #50438.

@jori-nordic
Copy link
Contributor

@tgagneret-embedded I have created an issue to track this, and will try to come up with a proper fix.

@jori-nordic
Copy link
Contributor

Closing the PR since we all seem to agree this is not the correct or full solution. The issue will be tracked in #53048 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment