Skip to content
This repository has been archived by the owner on Sep 7, 2020. It is now read-only.

Make agents wait for dev_set_config/WPS PBC before master discovery #1572

Open
wants to merge 28 commits into
base: master
Choose a base branch
from

Conversation

vitalii-komisarenko
Copy link
Collaborator

@vitalii-komisarenko vitalii-komisarenko commented Jul 28, 2020

https://jira.prplfoundation.org/browse/PPM-143

In certification mode the agent state machine should stop before the MASTER_DISCOVERY state until dev_set_config is received (wired backhaul) or start_wps_registration (wireless backhaul).

This PR should be merged after PR #1590.

@ghost
Copy link

ghost commented Jul 30, 2020

@vitalii-komisarenko please also check to see if there is any documentation about the state machine.
If it exists -> update it
Otherwise, it could be good to add it

@vitalii-komisarenko vitalii-komisarenko force-pushed the task/PPM-143-stop-on-failure-attempts branch 4 times, most recently from 786bbd6 to 3b77a85 Compare August 3, 2020 12:58
@ghost ghost requested a review from itayx August 4, 2020 09:04
@itayx
Copy link
Collaborator

itayx commented Aug 4, 2020

Log entries taken from the agent logs (Filtered to show only the backhaul_manager_thread FSM prints)

From the master branch

TRACE 10:23:42:970 <140023053342464> backhaul_manager_thread.cpp[795] --> FSM: INIT --> WAIT_ENABLE
TRACE 10:23:46:045 <140023053342464> backhaul_manager_thread.cpp[809] --> FSM: WAIT_ENABLE --> ENABLED
TRACE 10:23:46:546 <140023053342464> backhaul_manager_thread.cpp[900] --> FSM: ENABLED --> MASTER_DISCOVERY
TRACE 10:23:46:547 <140023053342464> backhaul_manager_thread.cpp[935] --> FSM: MASTER_DISCOVERY --> SEND_AUTOCONFIG_SEARCH_MESSAGE
TRACE 10:23:47:055 <140023053342464> backhaul_manager_thread.cpp[973] --> FSM: SEND_AUTOCONFIG_SEARCH_MESSAGE --> WAIT_FOR_AUTOCONFIG_RESPONSE_MESSAGE
TRACE 10:23:47:058 <140023053342464> backhaul_manager_thread.cpp[3585] --> FSM: WAIT_FOR_AUTOCONFIG_RESPONSE_MESSAGE --> CONNECT_TO_MASTER
TRACE 10:23:47:058 <140023053342464> backhaul_manager_thread.cpp[983] --> FSM: CONNECT_TO_MASTER --> CONNECTED
TRACE 10:23:47:066 <140023053342464> backhaul_manager_thread.cpp[998] --> FSM: CONNECTED --> OPERATIONAL

From this PR

TRACE 10:26:13:974 <140670739035904> backhaul_manager_thread.cpp[795] --> FSM: INIT --> WAIT_ENABLE
TRACE 10:26:17:060 <140670739035904> backhaul_manager_thread.cpp[809] --> FSM: WAIT_ENABLE --> ENABLED
TRACE 10:26:17:561 <140670739035904> backhaul_manager_thread.cpp[900] --> FSM: ENABLED --> READY_FOR_MASTER_DISCOVERY
ERROR 11:08:03:977 <140671996204800> backhaul_manager_thread.cpp[921] --> m_agent_ucc_listener == nullptr

It looks as if the FSM goes to state READY_FOR_MASTER_DISCOVERY then fails because the listener is null

@vitalii-komisarenko
Copy link
Collaborator Author

@itayx it is OK if the backhaul is wireless. The change is working only for the wired ones.

@vitalii-komisarenko vitalii-komisarenko force-pushed the task/PPM-143-stop-on-failure-attempts branch 2 times, most recently from afb2904 to 381527e Compare August 10, 2020 12:10
@vitalii-komisarenko vitalii-komisarenko marked this pull request as ready for review August 10, 2020 13:42
Copy link
Collaborator

@morantr morantr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the added state and I will approve :)

tests/test_flows.py Show resolved Hide resolved
tests/test_flows.py Show resolved Hide resolved
@vitalii-komisarenko vitalii-komisarenko force-pushed the task/PPM-143-stop-on-failure-attempts branch from 11765be to 768ce92 Compare August 12, 2020 11:10
Comment on lines 4128 to 4131
bool backhaul_manager::is_main_agent() {
auto db = AgentDB::get();
return db->device_conf.local_controller && db->device_conf.local_gw;
}
Copy link
Collaborator

@morantr morantr Aug 12, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO this function is redundant.
It is more clear to see the flags directly, and a logical AND of two flags does not justify creating a function.
Also "main_agent" is a little vague and confusing. For example, we use the term "main agent" to describe the process that contains the backhaul manager and the platform manager.
I really think that this commit should be dropped.

Move front radio configuration from "sPlatformSettings" struct
to the AgentDB.
Replace the use of front radio settings on the Agent thread to the one
on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
orenvor and others added 19 commits August 13, 2020 12:13
Move "certification_mode" configuration of from "sPlatformSettings"
struct to the AgentDB.
Replace the use of "certification_mode" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "stop_on_failure_attempts" configuration from
"sPlatformSettings" struct to the AgentDB.
Replace the use of "stop_on_failure_attempts" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "client_band_steering_enabled" configuration from
"sPlatformSettings" struct to the AgentDB.
Replace the use of "client_band_steering_enabled" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "client_optimal_path_roaming_enabled" configuration from
"sPlatformSettings" struct to the AgentDB.
Replace the use of "client_optimal_path_roaming_enabled" settings on the
Agent thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "dfs_reentry_enabled" configuration from "sPlatformSettings" struct
to the AgentDB.
Replace the use of "dfs_reentry_enabled" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "client_optimal_path_roaming_prefer_signal_strength_enabled"
configuration from "sPlatformSettings" struct to the AgentDB.
Replace the use of
"client_optimal_path_roaming_prefer_signal_strength_enabled"
settings on the Agent thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "client_11k_roaming_enabled" configuration from "sPlatformSettings"
struct to the AgentDB.
Replace the use of "client_11k_roaming_enabled" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "load_balancing_enabled" configuration from "sPlatformSettings"
struct to the AgentDB.
Replace the use of "load_balancing_enabled" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "service_fairness_enabled" configuration from "sPlatformSettings"
struct to the AgentDB.
Replace the use of "service_fairness_enabled" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "rdkb_extensions_enabled" configuration of back radio from
"sPlatformSettings" struct to the AgentDB.
Replace the use of "rdkb_extensions_enabled" settings on the Agent
thread to the one on the AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Remove unused "onboarding" configuration from "sPlatformSettings"
struct.

Replace the use of "onboarding" with false value.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Remove of unused fields from sPlatformSettings struct.
Those fields have been moved to the Agent_DB and there for no longer
needed as part of a beerocks massage.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Remove "platform_settings" local variable from son_slave_thread.
Replace the use of "platform_settings" to the settings in AgentDB.

Removing of the config from the tlvf will take place on future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Move "wlan_settings" configuration from son_slave_thread local variable
to the AgentDB.

Currently son_slave receive "wlan_settings" from platform_manager
using "PLATFORM_SON_SLAVE_REGISTER_REQUEST/RESPONSE" messages,
and save the settings in a local variable.

This transfer will be replaced by the use of agent_db to pass and
store "wlan_settings".

Removing of "sWlanSettings" from the tlvf will take place on
future commit.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Remove of unused fields: sPlatformSettings and wlan_settings
from PLATFORM_SON_SLAVE_REGISTER_RESPONSE message.
Those fields have been moved to the Agent_DB and there for no longer
needed as part of this beerocks massage.

PPM-338

Signed-off-by: Oren Vormaser <[email protected]>
Synchronization in test_flows.py is implemented by using timeouts.

They are too short to pass tests on my personal computer,
but they occasionally fail on the CI too.

Added more timeouts to make tests more robust.

Signed-off-by: Vitalii Komisarenko <[email protected]>
https://jira.prplfoundation.org/browse/PPM-143

Disable operational test for repeaters during environmental docker launch.

In certification mode we want to change agents logic to wait for
dev_set_config or WPS PBC in order to proceed to master discovery.

In environmental docker launch subroutine the check for operational state
is done right after launching the containers.

One of the options is changing `test_gw_repeater.sh` in order to send
dev_set_config there, but for sake of simplicity it was chosen to send
dev_set_config in `test_flows.py` (will be introduced in a later commit in this PR).

Signed-off-by: Vitalii Komisarenko <[email protected]>
https://jira.prplfoundation.org/browse/PPM-143

Send dev_set_config to the repeaters as a preparation to
changing agents logic to wait to dev_set_config or WPS PBC
before going to master discovery in certification mode.

Signed-off-by: Vitalii Komisarenko <[email protected]>
@vitalii-komisarenko vitalii-komisarenko force-pushed the task/PPM-143-stop-on-failure-attempts branch from 768ce92 to c34522d Compare August 13, 2020 10:04
@vitalii-komisarenko
Copy link
Collaborator Author

This PR uses changes from PR #1590.

@vitalii-komisarenko vitalii-komisarenko changed the title agent: backhaul_manager: add READY_FOR_MASTER_DISCOVERY state Make agents wait for dev_set_config/WPS PBC before master discovery Aug 13, 2020
@morantr morantr self-requested a review August 13, 2020 11:16
@morantr
Copy link
Collaborator

morantr commented Aug 13, 2020

Approved just make sure not to merge it until #1590 is merged.

tests/test_flows.py Show resolved Hide resolved
}
# Testing for operational status of the repeaters is disabled in PR #1572.
# In certification mode agents shall wait for dev_set_config or WPS PBC,
# so they are inoperational at this stage anyway.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A trade-off solution would have been to redefine the concept of "operational". Some other check should be performed to know if agent is up.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we still check the gateway...
If I am not mistaken, it is used only by test_flow.py. If something goes wrong, the tests will fail.
But yes, it is not ideal.

tests/test_flows.py Show resolved Hide resolved
agent/src/beerocks/slave/agent_ucc_listener.h Show resolved Hide resolved
agent/src/beerocks/slave/agent_ucc_listener.h Show resolved Hide resolved
Copy link
Collaborator

@mariomaz mariomaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments but I let them to your consideration, so approving.

@vitalii-komisarenko vitalii-komisarenko force-pushed the task/PPM-143-stop-on-failure-attempts branch from c34522d to 5346b54 Compare August 14, 2020 14:06
https://jira.prplfoundation.org/browse/PPM-143

In certification mode the state machine of backhaul manager should stop
before entering MASTER_DISCOVERY state until dev_set_config (wired backhaul)
or start_wps_registration (wireless backhaul) is received.

No code change needed for wireless flow.

In case of wired flow the exception is made for the agent running on the same
machine as controller. In that case UCC listener is present only on the controller,
so there is no easy way to send dev_set_config to the agent.

Signed-off-by: Vitalii Komisarenko <[email protected]>
@vitalii-komisarenko vitalii-komisarenko force-pushed the task/PPM-143-stop-on-failure-attempts branch from 5346b54 to de689cf Compare August 14, 2020 14:33
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants