diff --git a/.github/ISSUE_TEMPLATE/workgroup-request.md b/.github/ISSUE_TEMPLATE/workgroup-request.md index 413d8453b30..478db9300f3 100644 --- a/.github/ISSUE_TEMPLATE/workgroup-request.md +++ b/.github/ISSUE_TEMPLATE/workgroup-request.md @@ -13,7 +13,7 @@ assignees: lucyhyde **Working Group Details** **--Working Group Name:** diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index ee52999ac06..bb24a5b3684 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -36,7 +36,7 @@ For example: ## Expectations for pull requests Pull requests should be free of any known bugs and be accompanied by tests and appropriate documentation. Test coverage may include unit tests, integration tests such as [PTF tests](https://github.com/sonic-net/SONiC/wiki/HOWTO-write-a-PTF-Test) defined in the [sonic-mgmt repo](https://github.com/sonic-net/sonic-mgmt/tree/master/ansible/roles/test/tasks). -## Commiting new test +## Committing new test When committing a new feature with a new test, please complete a [test plan from the template](doc/SONiC%20Test%20Plan%20Template.md) diff --git a/doc/BGP/BGP-route-aggregation-with-bbr-awareness.md b/doc/BGP/BGP-route-aggregation-with-bbr-awareness.md index 0f8c14aaefa..4d5fac61774 100644 --- a/doc/BGP/BGP-route-aggregation-with-bbr-awareness.md +++ b/doc/BGP/BGP-route-aggregation-with-bbr-awareness.md @@ -18,7 +18,7 @@ - [CLI Design](#cli-design) - [Show CLI](#show-cli) - [Config CLI](#config-cli) -- [Feature limitaion](#feature-limitaion) +- [Feature limitation](#feature-limitation) ## Revision @@ -46,7 +46,7 @@ This document describes how to leverage the SONiC config DB to add or remove BGP ## Overview In BGP, we can aggregate contributing routes into one single aggregated route. It has many advantages, for example reducing routes’ count. -However, firstly, SONiC can’t configurate aggregated addresses via config DB and doesn’t have CLI support for it. 
+However, firstly, SONiC can’t configure aggregated addresses via config DB and doesn’t have CLI support for it. Secondly, if we aggregated routes without BBR feature on device, we many got packet drop on this device due to contributing routes missing. @@ -205,7 +205,7 @@ It's a string which should be the name of a prefix list, and the aggregate addre It's a string which should be the name of a prefix list, and the aggregate address will be append to the prefix list with prefix length filter to filter prefixes whose length greater or equal to the prefix length of the aggregate address. ### State DB Extension -For every aggregated address, we track its state in state DB, it has two states active and inactive. Active state means the address is configurated in the bgp container, while inactive state means isn't. +For every aggregated address, we track its state in state DB; it has two states, active and inactive. Active state means the address is configured in the bgp container, while inactive state means it isn't. #### State DB sample: ```json @@ -359,5 +359,5 @@ Then we will implement test in sonic-mgmt repo to test if this feature works inc 3. Remove aggregated address in config db and check whether the address will be removed from state db and bgp container. 4. More tests details will be published in sonic-mgmt repo. -## Feature limitaion -In CLOS network, aggregate deployment without simultaneous operation on all devices in same layer could lead to traffic imbalance due to traffic prefers go to detail routes, please only use this feature in traffic insensitve scenario or ensure deploy aggregate routes on all devices in same layer simultaneously. 
\ No newline at end of file +## Feature limitation +In a CLOS network, deploying aggregate routes without simultaneous operation on all devices in the same layer could lead to traffic imbalance, because traffic prefers the more specific routes. Please only use this feature in traffic-insensitive scenarios, or ensure aggregate routes are deployed on all devices in the same layer simultaneously. \ No newline at end of file diff --git a/doc/BGP/BGP-router-id.md b/doc/BGP/BGP-router-id.md index d4ed83027b3..41b8eb4a77b 100644 --- a/doc/BGP/BGP-router-id.md +++ b/doc/BGP/BGP-router-id.md @@ -32,7 +32,7 @@ This document describes a mechanism to allow user explicitly configure BGP route ### Overview Currently, there are some BGP hard codings in SONiC: -1. BGP router id was defined as a 32-bit value that uniquely identifies a BGP device. In single-asic device, SONiC uses Loopback0 IPv4 address as BGP router id. In mult-asic and uses Loopback4096 IPv4 address as BGP router id (for both iBGP and eBGP). This coupling prevents users from using customized router id. If IPv4 address of Loopback0 / Loopback4096 don't exist, BGP router id wouldn't be configured, then FRR would choose the largest IP address in device to be BGP router id. If the router id choosen by FRR is not unique, it would be considered an error. +1. BGP router id was defined as a 32-bit value that uniquely identifies a BGP device. In single-asic devices, SONiC uses Loopback0 IPv4 address as BGP router id. In multi-asic devices, SONiC uses Loopback4096 IPv4 address as BGP router id (for both iBGP and eBGP). This coupling prevents users from using a customized router id. If the IPv4 address of Loopback0 / Loopback4096 doesn't exist, BGP router id wouldn't be configured; then FRR would choose the largest IP address in the device to be BGP router id. If the router id chosen by FRR is not unique, it would be considered an error. 2. In single-asic device, SONiC wouldn't add BGP peer when there is not Loopback0 IPv4 exists. 
In multi-asic, SONiC wouldn't add eBGP peer when there is not Loopback0 IPv4 exists. Below is current workflow about BGP and router id in single-asic, only includes contents related to Loopback0. @@ -67,7 +67,7 @@ Add support to allow user explicitly configure BGP router id. 2 aspects enhancement: -1. Add a field `bgp_router_id` in `CONFIG_DB["DEVICE_METADATA"]["localhost"]` to support explicitly configure BGP router id. For multi-asic devices, this configuraion would be added to correspond config_db for each asic. If `CONFIG_DB["DEVICE_METADATA"]["localhost"]["bgp_router_id"]` configured, always use it as BGP router id. With this change, the new BGP router id configuration behavior will be like follow table. To be clarified that when bgp_router_id doesn't be configured, the behavior is totally same as previously. +1. Add a field `bgp_router_id` in `CONFIG_DB["DEVICE_METADATA"]["localhost"]` to support explicitly configuring BGP router id. For multi-asic devices, this configuration would be added to the corresponding config_db for each asic. If `CONFIG_DB["DEVICE_METADATA"]["localhost"]["bgp_router_id"]` is configured, it is always used as the BGP router id. With this change, the new BGP router id configuration behavior is shown in the following table. To be clear, when bgp_router_id is not configured, the behavior is exactly the same as before. | | Loopback0/Loopback4096 IPv4 address exists | Loopback0/Loopback4096 IPv4 address doesn't exist | |--------------|-------|------------| diff --git a/doc/BGP/BGP-supress-fib-pending.md b/doc/BGP/BGP-supress-fib-pending.md index 007ceb724e9..f05530b794c 100644 --- a/doc/BGP/BGP-supress-fib-pending.md +++ b/doc/BGP/BGP-supress-fib-pending.md @@ -219,7 +219,7 @@ module sonic-device_metadata { /* end of module sonic-device_metadata ``` -This knob can only be set to ```"enable"``` when syncrhonous SAI configuration mode is on. This constraint is guaranteed by the ```must``` expression for this leaf. 
+This knob can only be set to ```"enable"``` when synchronous SAI configuration mode is on. This constraint is guaranteed by the ```must``` expression for this leaf. #### 6.4. CLI @@ -295,7 +295,7 @@ B>q 0.0.0.0/0 [20/0] via 10.0.0.1, PortChannel102, weight 1, 00:04:46 q via 10.0.0.13, PortChannel1011, weight 1, 00:04:46 ``` -Once zebra is notified about successfull route offload status the *offload* flag is set: +Once zebra is notified about successful route offload status, the *offload* flag is set: ``` admin@sonic:~$ vtysh -c "show ip route 100.1.0.25/32 json" @@ -590,7 +590,7 @@ sequenceDiagram deactivate APPL_DB alt Suppression is disabled - Note left of RouteSync: Reply back immidiatelly
when this feature is disabled + Note left of RouteSync: Reply back immediately
when this feature is disabled RouteSync ->> zebra: Send back RTM_NEWROUTE
with RTM_F_OFFLOAD set zebra ->> RouteSync:
end diff --git a/doc/BGP/Bgpcfgd-dyn-peer-modification-support.md b/doc/BGP/Bgpcfgd-dyn-peer-modification-support.md index 3704bec02e0..91ead9385f3 100644 --- a/doc/BGP/Bgpcfgd-dyn-peer-modification-support.md +++ b/doc/BGP/Bgpcfgd-dyn-peer-modification-support.md @@ -122,7 +122,7 @@ The following modifications will be made to bgpcfgd to support the new scenario: ``` 4. Note that the default behavior when no update template is defined, is one where nothing executes during update peer operations, thereby making this change fully backward compatible and requiring no breaking changes in terms of templates for users of bgpcfgd. 4. We expose similar logic as listed in 1, 2 for a delete handling, ie we add a delete template under self.templates["delete"] if such a template exists in the directory structure -5. Upon a delete peer ocurring, we render the delete template(instead of executing the current behavior of ```no neighbor {{ neighbor addr}}```), on the other hand if a delete template is not defined then the default behavior of ```no neighbor {{ neighbor addr}}``` applies as usual, thereby making this change backward compatible. +5. Upon a delete peer occurring, we render the delete template (instead of executing the current behavior of ```no neighbor {{ neighbor addr}}```); on the other hand, if a delete template is not defined, then the default behavior of ```no neighbor {{ neighbor addr}}``` applies as usual, thereby making this change backward compatible. 6. Bgpcfgd will write a State DB entry per the schema from section 2.2. This will be utilized by the SDN controller to identify what has been processed by bgpcfgd in terms of configuration. ## 2.5 Docker-FPM-FRR bgpd template: diff --git a/doc/California-SB237/California-SB237.md b/doc/California-SB237/California-SB237.md index 91574a6606a..36a01d69b86 100644 @@ -25,7 +25,7 @@ * 1.14. [Test Plan](#TestPlan) * 1.14.1. 
[Unit Test cases](#UnitTestcases) * 1.14.2. [System Test cases](#SystemTestcases) - * 1.14.3. [Pasword Change Flow](#PasswordChangeFlow) + * 1.14.3. [Password Change Flow](#PasswordChangeFlow) * 1.15. [3rd Party Components](#rdPartyComponents) * 1.15.1. [PW Force Expiration](#WForceExpire) * 1.15.2. [Pam Unix](#PAMUNIX) @@ -143,7 +143,7 @@ Not relevant ### 1.11. Warmboot and Fastboot Design Impact -The feature can be triggered after sonic upgrade and warm reboot and feature doesn't affect trafic. +The feature can be triggered after sonic upgrade and warm reboot, and the feature doesn't affect traffic. ### 1.12. Restrictions/Limitations The California law feature is not supported on remote AAA. @@ -169,7 +169,7 @@ Check affecting password hardening feature: Check password hardening age is not affected - #### 1.14.3. Pasword Change Flow + #### 1.14.3. Password Change Flow Example of password change during 1st login. diff --git a/doc/DHCPv6_Relay/DHCPv6_Relay_HLD.md b/doc/DHCPv6_Relay/DHCPv6_Relay_HLD.md index b2220fbe225..00e2c723a16 100644 --- a/doc/DHCPv6_Relay/DHCPv6_Relay_HLD.md +++ b/doc/DHCPv6_Relay/DHCPv6_Relay_HLD.md @@ -55,7 +55,7 @@ This document describes the high level design of the DHCP Relay for IPv6 feature DHCP Relay for IPv6 feature in SONiC should meet the following high-level functional requirements: - Give the support for relaying DHCP packets from downstream networks to upstream networks using IPv6 addresses. -- Provide the functionality as a seperate process running on dhcp-relay docker container. +- Provide the functionality as a separate process running on dhcp-relay docker container. - Relaying messages to multiple unicast and multicast addresses. 
## 1.2 Configuration and Management Requirements diff --git a/doc/DHCPv6_relay/DHCPv6-relay-agent-High-Level-Design.md b/doc/DHCPv6_relay/DHCPv6-relay-agent-High-Level-Design.md index 6525ae50e3e..715d7b8f6ad 100644 --- a/doc/DHCPv6_relay/DHCPv6-relay-agent-High-Level-Design.md +++ b/doc/DHCPv6_relay/DHCPv6-relay-agent-High-Level-Design.md @@ -41,7 +41,7 @@ DUID: DHCP Unique Identifier (Each DHCPv6 client and server has a DUID. DHCPv6 s SONiC currently supports DHCPv4 Relay via the use of open source ISC DHCP package. However, DHCPv6 specification does not define a way to communicate client link-layer address to the DHCP server where DHCP server is not connected to the same network link as DHCP client. DHCPv6 requires all clients prepare and send a DUID as the client identifier in all DHCPv6 message exchanges. However, these methods do not provide a simple way to extract a client's link-layer address. Providing option 79 in DHCPv6 Relay-Forward messages will help carry the client link-layer address explicitly. The server needs to know the client's MAC address to allow DHCP Reservation, which provides pre-set IP address to specific client based on its physical MAC address. The DHCPv6 relay agent is able to read the source MAC address of DHCPv6 messages that it received from client, and encapsulate these messages within a DHCPv6 Relay-Forward message, inserting the client MAC address as option 79 in the Relay-Forward header sent to the server. -With heterogenous DHCP client implementation across the network, DUIDs could not resolve IP resource tracking issue. The two types of DUIDs, DUID-LL and DUID-LLT used to facilitate resource tracking both have link layer addresses embedded. 
The current client link-layer address option in DHCPv6 specification limits the DHCPv6 Relay to first hop to provide the client link layer address, which are relay agents that are connected to the same link as the client, and that limits SONiC DHCPv6 deployment to ToR/MoR switches for early stages. One solution would be to provide SONiC's own DHCPv6 relay agent feature. ISC DHCP currently has no support for option 79. Configuration wise, using ISC DHCP configuration requires restarting container as configuration is provided through the commandline. The plan is to eventually move away from ISC DHCP configuration, which is fairly complex, and provide SONiC's own configuration. +With heterogeneous DHCP client implementations across the network, DUIDs could not resolve the IP resource tracking issue. The two types of DUIDs, DUID-LL and DUID-LLT, used to facilitate resource tracking both have link layer addresses embedded. The current client link-layer address option in DHCPv6 specification limits the DHCPv6 Relay to first hop to provide the client link layer address, which are relay agents that are connected to the same link as the client, and that limits SONiC DHCPv6 deployment to ToR/MoR switches for early stages. One solution would be to provide SONiC's own DHCPv6 relay agent feature. ISC DHCP currently has no support for option 79. Configuration wise, using ISC DHCP configuration requires restarting container as configuration is provided through the commandline. The plan is to eventually move away from ISC DHCP configuration, which is fairly complex, and provide SONiC's own configuration. # DHCPv6 diff --git a/doc/Dump-Utility.md b/doc/Dump-Utility.md index 5ed9398dc36..b87775ddbb4 100644 --- a/doc/Dump-Utility.md +++ b/doc/Dump-Utility.md @@ -75,7 +75,7 @@ dump state port all 3) This argument could either be a table-key or a unique-field-value present in either Conf DB or Appl DB. 
* Eg: For PORT, the second argument will be an interface name i.e 'Ethernet128' which is a table-key. On the other hand, the secondary argument for COPP will be a trap_id such as 'arp_req', which is a field-value and not a key of any table. 4) The decision of what to pass as a secondary argument lies with the discretion of the one who is writing the module. -6) The Command should also take a list of comma seperated inputs for the secondary argument +6) The Command should also take a list of comma-separated inputs for the secondary argument 7) The Command should also accept an "all" value and which means it should print the unified view for every entry related to that feature. ``` @@ -644,7 +644,7 @@ class Port(Executor): 2) MatchEngine / MatchRequest: Provided to abstract the heavy lifting in fetching the required data from redis-db/config-files. More info in the next section. 3) verbose_print(str_): prints to the stdout based on verbosity provided by the user. 4) handle_error(err_str, excep=False): Prints the error output to stdout, if any experienced by the module, Set excep = True, to raise an exception -5) handle_multiple_keys_matched_error(err_str, key_to_go_with="", excep=False): When a filtering criteria specified by the module matches multiple keys, wherein it is expected to match ony one, this method can be used. +5) handle_multiple_keys_matched_error(err_str, key_to_go_with="", excep=False): When a filtering criterion specified by the module matches multiple keys, where it is expected to match only one, this method can be used. ### 2.4 Match Infrastructure @@ -672,7 +672,7 @@ To Abstract this functionality out, a MatchEngine class is created. A MatchReque "just_keys": "true|false" # Mandatory, if true, Only Returns the keys matched. Does not return field-value pairs. 
 Defaults to True "ns" : DEFAULT_NAMESPACE # namespace argument, if nothing is provided, default namespace is used "match_entire_list" : False # Some of the fields in redis can consist of multiple values eg: trap_ids = "bgp,bgpv6,ospf". - When this arg is set to true, entire list is matched incluing the ",". + When this arg is set to true, entire list is matched including the ",". When False, the values are split based on "," and individual items are matched with } ``` @@ -995,7 +995,7 @@ admin@sonic:~$ dump state acl_rule 'DATA_L3|R1' -t | S.No | Test case synopsis | |------|-----------------------------------------------------------------------------------------------------------------------------------------| -| 1 | Verify MatchEngine funtionality in cases of invalid Request Objects | +| 1 | Verify MatchEngine functionality in cases of invalid Request Objects | | 2 | Verify MatchEngine Match functionality is as expected | | 3 | Verify all the options in the CLI is working as expected | | 4 | Verify the namespace arg is working as expected | diff --git a/doc/L3_performance_and_scaling_enchancements_HLD.md b/doc/L3_performance_and_scaling_enchancements_HLD.md index c01435cb374..db31583ee4a 100644 --- a/doc/L3_performance_and_scaling_enchancements_HLD.md +++ b/doc/L3_performance_and_scaling_enchancements_HLD.md @@ -157,7 +157,7 @@ To measure route programming time, BGP routes were advertised to a SONiC router - If the lookup in the local link cache fails, fpmsyncd updates cache by getting the configured links from the kernel. The problem here is if there are no VNET present on the system, this lookup will always fail and cache is updated for every route. - This seems to slow down the rate which route programed for the global route table. + This seems to slow down the rate at which routes are programmed for the global route table. To fix this we will skip the lookup for the Master device name if the route object table value is zero .i.e. 
the route needs to put in the global routing table @@ -176,18 +176,18 @@ To measure route programming time, BGP routes were advertised to a SONiC router With the above mentioned optimizations we target to get 30% reduction in the route programming time in SONiC. -### 3.3 show CLI command enchancements +### 3.3 show CLI command enhancements #### 3.3.1 show arp/ndp command improvement. -The current implemetation of the cli script for "show arp" or "show ndp" fetches the whole FDB table to get the outgoing interface incase the L3 interface is a VLAN L3 Interface. +The current implementation of the cli script for "show arp" or "show ndp" fetches the whole FDB table to get the outgoing interface in case the L3 interface is a VLAN L3 Interface. This slows down the show command. We will make changes to the CLI script to get FDB entries only for this specific ARP/ND instead of getting the whole FDB table. These changes will improve the performance of the command significantly # 4 Warm Boot Support -No specific changes are planned for Warm boot support as these are exisiting features. +No specific changes are planned for Warm boot support as these are existing features. However, testing will done to make sure the changes done, for scaling or performance improvements, won't affect the Warm boot functionality. @@ -203,7 +203,7 @@ However, testing will done to make sure the changes done, for scaling or perform | 4. | Verify 128k IPv4 routes are installed and measure the route programming time | | | | 5. | Verify 160k IPv4 routes are installed and measure the route programming time | | | | 6. | Verify 200k IPv4 routes are installed and measure the route programming time | | | -| 7. | Verfiy 8k IPv4 ARP entries are learnt and measure the learning time | | | +| 7. | Verify 8k IPv4 ARP entries are learnt and measure the learning time | | | | 8. | Verify 16k IPv4 ARP entries are learnt and measure the learning time | | | | 9. 
| Verify 32k IPv4 ARP entries are learnt and measure the learning time | | | ## 5.2 IPv6 Testcases @@ -212,10 +212,10 @@ However, testing will done to make sure the changes done, for scaling or perform | 1. | Verify 10k IPv6 routes with prefix >64b are installed and measure the route programming time | | | | 2. | Verify 25k IPv6 routes with prefix > 64b are installed and measure the route programming time | | | | 3. | Verify 40k IPv6 routes with prefix > 64b are installed and measure the route programming time | | | -| 4. | Verfiy 8k IPv6 ND entries are learnt and measure the learning time | | | +| 4. | Verify 8k IPv6 ND entries are learnt and measure the learning time | | | | 5. | Verify 16k IPv6 ND entries are learnt and measure the learning time | | | | | | | | -## 5.3 Regresssion Testcases +## 5.3 Regression Testcases | Testcase number | Testcase | Result | Time taken | | --------------- | ----------------------------------------------------------- | ------ | ---------- | | 1. | Measure the convergence time with link flaps | | | diff --git a/doc/MSTP/MSTP.md b/doc/MSTP/MSTP.md index 496bf76cf59..a30a0e67ed1 100644 --- a/doc/MSTP/MSTP.md +++ b/doc/MSTP/MSTP.md @@ -35,8 +35,8 @@ - [Instance deletion](#instance-deletion) - [MSTP Instance creation](#mstp-Instance-Creation) - [MSTP Instance deletion](#mstp-Instance-Deletion) - - [Add VLAN to Exisiting Instance](#add-vlan-existing) - - [Delete VLAN to Exisiting Instance](#delete-vlan-existing) + - [Add VLAN to Existing Instance](#add-vlan-existing) + - [Delete VLAN to Existing Instance](#delete-vlan-existing) - [Add VLAN member](#add-vlan-member) - [Del VLAN member](#del-vlan-member) * [Configuration Commands](#configuration-commands) @@ -330,7 +330,7 @@ mst_boundary_proto = BIT ; 'enabled' or 'disabled' #### STP_INST_PORT_FLUSH_TABLE ``` ;Defines instance and port for which FDB Flush needs to be performed -key = STP_INST_PORT_FLUSH_TABLE:instane:ifname ; FDB Flush instance id and port name +key = 
STP_INST_PORT_FLUSH_TABLE:instance:ifname ; FDB Flush instance id and port name state = "true" ``` @@ -404,7 +404,7 @@ The MSTP standard does not support UplinkFast, so this functionality will be dis ## Add VLAN to Existing Instance ![MSTP VLAN Add](images/MSTP_Add_ExistingInstance.drawio.png) -## Del VLAN from Exisiting Instance +## Del VLAN from Existing Instance ![MSTP VLAN Del](images/MSTP_Del_ExistingInstance.drawio.png) # Configuration Commands @@ -473,13 +473,13 @@ The existing interface-level STP commands as below will re-used for configuring - Configure an interface for MSTP. - **config spanning_tree interface edgeport {enable|disable} \** - - This command enables or disables the Edge Port feature on the speficied interface. + - This command enables or disables the Edge Port feature on the specified interface. - **config spanning_tree interface bpdu_guard {enable|disable} \** - - This command enables or disables the BPDU Guard feature on the speficied interface. + - This command enables or disables the BPDU Guard feature on the specified interface. - **config spanning_tree interface root_guard {enable|disable} \** - - This command enables or disables the Root Guard feature on the speficied interface. + - This command enables or disables the Root Guard feature on the specified interface. - **config spanning_tree interface priority \ \** - Set the specified port level priority for the specified interface in seconds. 
diff --git a/doc/Query_Stats_Capability/Query_Stats_Capability_HLD.md b/doc/Query_Stats_Capability/Query_Stats_Capability_HLD.md index 440447a64ed..e1e55d124c4 100644 --- a/doc/Query_Stats_Capability/Query_Stats_Capability_HLD.md +++ b/doc/Query_Stats_Capability/Query_Stats_Capability_HLD.md @@ -1,7 +1,7 @@ -# Query Stats Capability new SAI API indroduction +# Query Stats Capability new SAI API introduction # Table of Contents -- [Query Stats Capability new SAI API indroduction](#query-stats-capability-new-sai-api-indroduction) +- [Query Stats Capability new SAI API introduction](#query-stats-capability-new-sai-api-introduction) - [Table of Contents](#table-of-contents) - [Introduction and Motivation](#introduction-and-motivation) - [Requirements](#requirements) @@ -31,7 +31,7 @@ But, it will require all vendors either implement the API or return SAI_STATUS_N # Code example -- Current implemenation: +- Current implementation: ``` for (int id = SAI_PORT_STAT_IF_IN_OCTETS; id <= SAI_PORT_STAT_IF_OUT_FABRIC_DATA_UNITS; ++id) { @@ -49,7 +49,7 @@ for (int id = SAI_PORT_STAT_IF_IN_OCTETS; id <= SAI_PORT_STAT_IF_OUT_FABRIC_DATA } ``` -- New implemenation: +- New implementation: ``` sai_stat_capability_list_t stats_capability; stats_capability.count = 0; diff --git a/doc/SONIC_Test_Ingress_Discards_HLD.md b/doc/SONIC_Test_Ingress_Discards_HLD.md index c20e0f574bc..7fc80ec8813 100644 --- a/doc/SONIC_Test_Ingress_Discards_HLD.md +++ b/doc/SONIC_Test_Ingress_Discards_HLD.md @@ -96,8 +96,8 @@ Please refer to the test case for detailed description. | sonic-clear counters | Clear counters | | sonic-clear rifcounters | Clear RIF counters | -As different vendors can have diferent drop counters calculation, for example L2 and L3 drop counters can be combined and L2 drop counter will be increased for all ingress discards. -So for valid drop counters verification there is a need to distinguish wheter drop counters are combined or not for current vendor. 
+Different vendors can calculate drop counters differently; for example, L2 and L3 drop counters can be combined, and the L2 drop counter will be increased for all ingress discards. +So for valid drop counter verification there is a need to distinguish whether drop counters are combined or not for the current vendor. This can be done by checking platform name of the DUT. ##### Work need to be done based on this case diff --git a/doc/SONiC-User-Manual.md b/doc/SONiC-User-Manual.md index b157f62ec0d..23a6dd6b3b3 100644 --- a/doc/SONiC-User-Manual.md +++ b/doc/SONiC-User-Manual.md @@ -659,7 +659,7 @@ Following "show" commands can be used to check the port status. - Show interface status ( up/down) - Show interface transceiver presence -Following "redis-dump" command can be used to dump the port configuraiton from the ConfigDB. +Following "redis-dump" command can be used to dump the port configuration from the ConfigDB. Example : ``` diff --git a/doc/SONiC_local_users_password_reset_hld.md b/doc/SONiC_local_users_password_reset_hld.md index 61a30eeb0bc..d22854e5d83 100644 --- a/doc/SONiC_local_users_password_reset_hld.md +++ b/doc/SONiC_local_users_password_reset_hld.md @@ -78,7 +78,7 @@ The service is dependent on the ```database.service``` for reading the configure This feature will be a built-in SONiC feature. There will be three main files to consider in this feature: 1. ```src/sonic-platform-common/sonic_platform_base/reset_local_users_passwords_base.py``` - The default behavior implementation will reside in this file, including when to trigger the feature and how to reset the local users' configurations. -2. ```platform//sonic_platform/reset_local_users_passwords.py``` - The vendor specifc implementation of the feature, each vendor can decide what is the trigger cause to start the functionality and how it is implemented. +2. 
```platform//sonic_platform/reset_local_users_passwords.py``` - The vendor specific implementation of the feature, each vendor can decide what is the trigger cause to start the functionality and how it is implemented. 3. ```src/sonic-host-services/scripts/reset-local-users-passwords``` - The python file that will be called on service start during init that imports the vendor's specific implementation. The default behavior will delete non-default users and restore the original passwords of the default users and expire them based on the content of the file of ```/etc/sonic/default_users.json``` on long reboot press. @@ -144,7 +144,7 @@ The ```LocalUsersConfigurationResetPlatform``` class residing in ```platform/ matches; - string action; // array? - }; - - Struct/class AclTable { - sai_object_id_t saiId; - string table_id; - string description; // needed? - table_type_t m_type; - vector m_rules; - }; - - class AclOrch : public Orch { - void doTask(); - void doAclTableTask(); - void doAclRuleTask(); - ... - vector m_AclTables; - } -``` -This class will be responsible for: - -- processing updates of the ACL tables (create/delete/update) -- partial input data (App DB) validation (including cross-table validation) -- replicating ACL data from the App DB to the SAI DB via SAIRedis -- caching of the ACL objects in order to detect objects update and perform state dump. -#### 3.1.3.2 Acl Table Create or Delete -AclOrch class will inherit and reuse Orch class functionality which exploits producer-consumer mechanism (implemented in swss-common) to track changes in the Redis database tables. ACL Tables are stores under ACL_TABLE:* keys in App DB. On ACL_TABLE update in the App DB AclOrch::doAclTableTask() will be called to process the change. On table create AclOrch will verify if the table already exists (using table_id) creating of the table which already exists will be processed as update. 
Regular create or delete will update the internal class structures and appropriate SAI objects will be created or deleted. -Validation: on create validate table type. -#### 3.1.3.3 Acl Rule Create or Delete -ACL Rules are stores under ACL_RULE_TABLE:* keys in App DB. On ACL_RULE_TABLE update in the App DB AclOrch::doAclRuleTask() will be called to process the change. On table create AclOrch will verify if the rule already exists (using rule_id) creating of the rule which already exists will be processed as update. Regular create or delete will update the internal class structures and appropriate SAI objects will be created or deleted. -Validation: make sure the table exists, the list of match criterias is valid and fits the table, the list of actions is valid. -### 3.1.4 SAI Redis -No updates in Phase 1. -### 3.1.5 SAI DB -No updates in Phase 1. -### 3.1.6 syncd -No updates in Phase 1. -### 3.1.7 General updates -Add definitions for the table names "ACL_TABLE" and "ACL_RULE_TABLE" to the schema.h -## 3.2 Phase 2 -### 3.2.1 Orchestration Agent -#### 3.2.1.1 Counters -Add handling of counter action for tables and rules. This assumes automatic counter object creation and adding it to each rule on create and removing on delete. -```c++ - struct AclRule { - sai_object_id_t saiId; - sai_object_id_t counter_oid; - string rule_id; - map matches; - string action; // array? - }; -``` -There will counters to register number of packets and number of bytes processed by the rule. -Counters will be stored to the DB #2 with the predefined period. Update period will be hard coded. The default value will be 10 seconds. -DB Schema for ACL counters is the following: - - COUNTERS:ACL_TABLE_NAME:ACL_RULE_NAME - Packets : - Bytes : - -#### 3.2.1.2 ACL Table Update -If an update refers the table which already exists, this change will be considered as update. This will cause updating of internal records as well as corresponding SAI objects. Updating SAI objects may require recreating them. 
-#### 3.2.1.3 ACL Rule Update -If an update refers the rule which already exists, this change will be considered as update. This will cause updating of internal records as well as corresponding SAI objects. -Validation: similar to the one performed on create. -#### 3.2.1.4 Configuration update -Besides strait forward "delete-create" way of update need to consider performing "safe update" when a new configuration will be created prior to removing the old one. And switch to the new configuration only if it is successfully created. This will require resolving at least two issues: - -- need to be sure there are enough hardware resources to hold both old and new configurations -- update should be "atomic". I.e. Orchestration Agent should receive an entire update before starting an update. - -Validation: similar to the one performed on create. - -## 3.3 Phase 3 -In Phase 3 there will be implemented ACL Ranges support and ACLTable to port binding. -### 3.3.1 Orchestration Agent -#### 3.3.1.1 ACL Ranges support: -In Orchestration Agent in class AclOrch: -```c++ - struct AclRange { - sai_object_id_t saiId; - tuple range; - } - - struct AclCounter { - sai_object_id_t saiId; - } - - struct AclRule { - sai_object_id_t saiId; - string rule_id; - map matches; - string action; // array? - AclCounter byteCounter; - AclCounter packetCounter; - }; - - struct/class AclTable { - sai_object_id_t saiId; - string table_id; - string description; // needed? - table_type_t m_type; - vector m_ports; - vector m_rules; - }; - - class AclOrch : public Orch { - void doTask(); - void doAclTableTask(); - void doAclRuleTask(); - ... - vector m_AclTables; - map , AclRange> m_AclRanges; - } -``` -Add handling, caching and validation of range matching. This also includes detecting and reusing of identical ranges in order to save hardware resources. 
- -#### 3.3.1.2 Binding ACL Table to Port -While declaring ACL table in a json config file it is mandatory to specify a port or the list of ports this table will be bound to. Starting from the SAI v1.0 multiple tables cannot be bound to one port. To implement this feature tables first have to be added to a group and then group could be bound to the port. -Groups will be created and managed by Ports (`class Port`, implemented in `orchagent/port.cpp`). `PortsOrch` class API will be extended with the method `getPort` to return an appropriate Port class instance. The Port class will provide method `bindAclTable` which will handle creation of the group, binding this group to the port and adding given ACL table to the corresponding group. - -Code sample which binds table to the port: - - sai_status_t AclOrch::bindAclTable(sai_object_id_t table_oid,..) - { - for (const auto& portOid : aclTable.ports) - { - Port port; - gPortsOrch->getPort(portOid, port); - - sai_object_id_t group_member_oid; - status = port.bindAclTable(group_member_oid, table_oid); - ... -If LAG port not created yet when bind ACL table to it, LAG port will be added to an internal pending port list, after LAG port created, AclOrch will get notification from STATE_DB, and will bind the ACL table to the LAG port. This is implemented by adding a "doAclTablePortUpdateTask" to handle the port configured notification from STATE_DB. - -#### 3.3.1.3 ACL and LAG - -- LAG member port shall not be added to the ACL Tables, or will be considered as invalid configuration and return fail. -- LAG ACL configurations will be automatically applied to all the LAG members, this is done by SAI/SDK. 
- -#### 3.3.1.3 ACL mirroring -```c++ - class AclRule - { - public: - AclRule(AclOrch *aclOrch, string rule, string table); - virtual bool validateAddPriority(string attr_name, string attr_value); - virtual bool validateAddMatch(string attr_name, string attr_value); - virtual bool validateAddAction(string attr_name, string attr_value) = 0; - virtual bool validate() = 0; - bool processIpType(string type, sai_uint32_t &ip_type); - - virtual bool create(); - virtual bool remove(); - virtual void update(SubjectType, void *) = 0; - - string getId() - { - return id; - } - - string getTableId() - { - return table_id; - } - - sai_object_id_t getCounterOid() - { - return counter_oid; - } - - static shared_ptr makeShared(acl_table_type_t type, AclOrch *acl, MirrorOrch *mirror, string rule, string table); - virtual ~AclRule() {}; - - protected: - virtual bool createCounter(); - virtual bool removeCounter(); - - AclOrch *aclOrch; - string id; - string table_id; - sai_object_id_t table_oid; - sai_object_id_t rule_oid; - sai_object_id_t counter_oid; - uint32_t priority; - map matches; - map actions; - }; - - class AclRuleL3: public AclRule - { - public: - AclRuleL3(AclOrch *aclOrch, string rule, string table); - - bool validateAddAction(string attr_name, string attr_value); - bool validate(); - void update(SubjectType, void *); - }; - - class AclRuleMirror: public AclRule - { - public: - AclRuleMirror(AclOrch *aclOrch, MirrorOrch *mirrorOrch, string rule, string table); - bool validateAddAction(string attr_name, string attr_value); - bool validate(); - bool create(); - bool remove(); - void update(SubjectType, void *); - AclRuleCounters getCounters(); - - protected: - bool m_state; - string sessionName; - acl_stage_type_t m_tableStage; - AclRuleCounters counters; - MirrorOrch *m_pMirrorOrch; - }; - - struct AclTable { - string id; - string description; - acl_table_type_t type; - ports_list_t ports; - // Map rule name to rule data - map> rules; - AclTable(): 
type(ACL_TABLE_UNKNOWN) {} - }; -``` -To support mirror action bind to both ingress and egress ACL rule, an member "acl_stage_type_t m_tableStage" added -to class AclRuleMirror to indicate the stage the ACL mirror rule, according to the stage can select proper mirror -action, "SAI_ACL_ENTRY_ATTR_ACTION_MIRROR_INGRESS" for ingress ACL rule, "SAI_ACL_ENTRY_ATTR_ACTION_MIRROR_EGRESS" -for egress ACL rule. -Add possibility to receive updates about mirror sessions state change and perform mirroring rules state change accordingly. -# 4 Flows -## 4.1 Creating of ACL Objects -![](acl_create.png) -## 4.2 Deleting of ACL Objects -![](acl_delete.png) -## 4.3 Updating of ACL Objects -Depending on the number of changed properties in the updated ACL object, update may include one or more extra delete/create calls to the SAI Redis. -![](acl_update.png) -## 4.4 Creating of ACL Mirror rules -![](acl_mirror_rule_flow.svg) -## 4.5 Deleting of ACL Mirror rules -![](mirror_delete.png) -## 4.6 Mirror state change handling -![](mirror_state_change.png) -# 5 swssconfig input file format and restrictions -- Valid json file. The file should be in the format swssconfig can process. This assumes lists surrounded by square brackets, dictionaries with curly brackets (braces), tuples inside dictionary separated with semicolon and enumerated elements separated with the comma. -- Logical consistency. The configuration provided should be complete. Rules should not refer non-existing tables, etc. -- Order: Tables should appear before Rules. -- The list of keywords to be used to address different match criterias and actions provided in Appendix A -- Rules should have at least one match criteria and one action -- List of ports to bind to the table should contain physical port names. -- Maximum number of rules allowed: 1000 rules total in the all "L3" tables and 256 rules total in all "Mirror" tables. -See json file example is in Appendix B. 
-# 6 Testing -## 6.1 Testing environment -Ansible + PTF -## 6.2 List of tests to cover basic functionality -- simple permit (any) -- simple deny (any) -- permit/deny with matching (IP, port, ethertype, etc) -## 6.3 Additional tests for Pase 2/3 -- permit/deny and counter -- permit/deny with range -- permit/deny with two ranges (src, dst) - -# Appendix A:Keywords for matches and actions -###### **Table 8: Json file keywords** -|Keyword | Description| -|------------|------------| -|policy_desc | ACL Table property, contains human readable table description string -|type | ACL Table property. Could be "L3" or "Mirror" -|ports | ACL Table property. String with comma separated port names. -|priority | ACL Rule property. Rule priority in the table -| | MATCHES -|src_ip | ACL Rule property. Source IP address -|dst_ip | ACL Rule property. Destination IP address -|l4_src_port | ACL Rule property. L4 source port -|l4_dst_port | ACL Rule property. L4 destination port -|l4_src_port_range | ACL Rule property. L4 source ports range. Valid for rules in "L3" tables only -|l4_dst_port_range | ACL Rule property. L4 destination ports range. Valid for rules in "L3" tables only -|ether_type | ACL Rule property. Ethernet type -|ip_protocol | ACL Rule property. Ip protocol -|tcp_flags | ACL Rule property. TCP flags -|ip_type | ACL Rule property. IP type -|dscp | ACL Rule property. Dscp field. Valid for rules "mirror" tables only -|inner_src_ip | ACL Rule property. Inner src ip prefix to match on a VXLAN packet. -|tunnel_vni | ACL Rule property. VXLAN VNI field to match on. -| | ACTIONS -|packet_action | ACL Rule property. Packet actions "forward" or "drop". Valid for rules in "L3" tables only -|mirror_action | Action "mirror". Valid for rules in "mirror" tables only -|inner_src_mac_rewrite_action | Action to rewrite the inner src mac rewrite field. 
-*Keywords derived from the SAI ACL attributes.* -# Appendix B: Sample input json file -``` - [ - { - "ACL_TABLE:0d41db739a2cc107": { - "policy_desc" : "Permit some traffic, for the customer #4", - "type" : "L3" - "ports" : [ - "port1", - "port2", - "port3" - ] # physical port names - }, - "OP": "SET" - }, - { - "ACL_RULE_TABLE:0d41db739a2cc107:3f8a10ff": { - "priority" : "55", - "IP_PROTOCOL" : "TCP", - "SRC_IP" : "20.0.0.0/25", - "DST_IP" : "20.0.0.0/23", - "L4_SRC_PORT_RANGE: "1024-65535", - "L4_DST_PORT_RANGE: "80-89", - "PACKET_ACTION" : "FORWARD" - }, - "OP": "SET" - }, - ] -``` -# Appendix C: Code sample -Below is the pseudo-code in C which shows how the configuration described in the Appendix B will be applied using SAI API. -```c++ - // SAI API query... - sai_acl_api_t *acl_api; - sai_port_api_t *port_api; - - // Create table - sai_attribute_t table_attrs[] = - { - {.id = SAI_ACL_TABLE_ATTR_STAGE, - .value.s32 = SAI_ACL_STAGE_INGRESS}, - {.id = SAI_ACL_TABLE_ATTR_PRIORITY, - .value.u32 = 10}, - {.id = SAI_ACL_TABLE_ATTR_SIZE, - .value.u32 = 0}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_ETHER_TYPE, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_IP_TYPE, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_IP_PROTOCOL, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_SRC_IP, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_DST_IP, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_L4_SRC_PORT, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_L4_DST_PORT, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_TCP_FLAGS, - .value.booldata = true}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_RANGE, - .value.s32 = SAI_ACL_RANGE_L4_SRC_PORT_RANGE}, - {.id = SAI_ACL_TABLE_ATTR_FIELD_RANGE, - .value.s32 = SAI_ACL_RANGE_L4_DST_PORT_RANGE} - }; - - size_t attrs_num = sizeof(table_attrs)/sizeof(table_attrs[0]); - - sai_status_t status; - sai_object_id_t acl_table; - - status = 
acl_api->create_acl_table(&acl_table, attrs_num, table_attrs); - - // Create ranges - sai_object_id_t acl_ranges[2]; - - sai_attribute_t range_attrs[] = - { - {.id = SAI_ACL_RANGE_ATTR_TYPE, - .value.s32 = SAI_ACL_RANGE_L4_SRC_PORT_RANGE}, - {.id = SAI_ACL_RANGE_ATTR_LIMIT, - .value.u32range = (sai_u32_range_t) {.min = 1024, .max = 65535}} - }; - - attrs_num = sizeof(range_attrs)/sizeof(range_attrs[0]); - status = acl_api->create_acl_range(&acl_ranges[0],attrs_num,range_attrs); - status = acl_api->create_acl_range(&acl_ranges[1],...); - - - // Create Entry (rule) - sai_object_id_t entry; - - sai_attribute_t entry_attrs[] = { - {.id = SAI_ACL_ENTRY_ATTR_TABLE_ID, - .value.oid = acl_table}, - {.id = SAI_ACL_ENTRY_ATTR_PRIORITY, - .value.u32 = 55}, - {.id = SAI_ACL_ENTRY_ATTR_ADMIN_STATE, - .value.booldata = true}, - {.id = SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP, - .value.aclfield.data.ip4 = 0x14000000; - .value.aclfield.mask.ip4 = 0xFFFFFF80; - }, - {.id = SAI_ACL_ENTRY_ATTR_FIELD_DST_IP, - .value.aclfield.data.ip4 = 0x14000000; - .value.aclfield.mask.ip4 = 0xFFFFFE00; - }, - {.id = SAI_ACL_ENTRY_ATTR_FIELD_RANGE, - .value.aclfield.data.objlist.list = acl_ranges, - .value.aclfield.data.objlist.count = 2}, - {.id = SAI_ACL_ENTRY_ATTR_PACKET_ACTION, - .value.aclaction.enable = true, - .value.aclaction.parameter.s32 = SAI_PACKET_ACTION_FORWARD} - }; - - attrs_num = sizeof(entry_attrs)/sizeof(entry_attrs[0]); - status = acl_api->create_acl_entry(&entry, attrs_num, entry_attrs)); - - - // Bind ACL table to port - sai_attribute_t port_attr = - { - .id = SAI_PORT_ATTR_INGRESS_ACL_LIST, - .value.objlist.list = acl_table, - .value.objlist.count = 1 - }; - - status = port_api->set_port_attribute(port_object_id, &port_attr); -``` +# ACL in SONiC +# High Level Design Document +### Rev 1.1 + +# Table of Contents + * [List of Tables](#list-of-tables) + * [Revision](#revision) + * [About this Manual](#about-this-manual) + * [Scope](#scope) + * 
[Definitions/Abbreviation](#definitionsabbreviation) + * [1 Sub-system Overview](#1-sub-system-overview) + * [1.1 System Chart](#11-system-chart) + * [1.2 Modules description](#12-modules-description) + * [1.2.1 swssconfig](#121-swssconfig) + * [1.2.2 App DB](#122-app-db) + * [1.2.3 Orchestration Agent](#123-orchestration-agent) + * [1.2.4 SAI Redis](#124-sai-redis) + * [1.2.5 SAI DB](#125-sai-db) + * [1.2.6 syncd](#126-syncd) + * [1.2.7 SAI (Redis and Switch)](#127-sai-redis-and-switch) + * [2 ACL Subsystem Requirements Overview](#2-acl-subsystem-requirements-overview) + * [2.1 Functional requirements](#21-functional-requirements) + * [2.2 Scalability requirements](#22-scalability-requirements) + * [2.3 Requirements implementation schedule](#23-requirements-implementation-schedule) + * [3 Modules Design](#3-modules-design) + * [3.1 Phase 1](#31-phase-1) + * [3.1.1 swssconfig](#311-swssconfig) + * [3.1.2 App DB](#312-app-db) + * [3.1.2.1 App DB Schema Reference](#3121-app-db-schema-reference) + * [3.1.2.1.1 ACL Tables Table](#31211-acl-tables-table) + * [3.1.2.1.2 ACL Rules Table](#31212-acl-rules-table) + * [3.1.2.2 ACL Table](#3122-acl-table) + * [3.1.2.3 ACL Rule](#3123-acl-rule) + * [3.1.2.4 Table of type "L3"](#3124-table-of-type-l3) + * [3.1.2.5 Table of type "Mirror"](#3125-table-of-type-mirror) + * [3.1.2.6 A Custom Table to perform inner src mac rewrite](#3126-a-custom-table-to-perform-inner-src-mac-rewrite) + * [3.1.3 Orchestration Agent](#313-orchestration-agent) + * [3.1.3.1 Class AclOrch](#3131-class-aclorch) + * [3.1.3.2 Acl Table Create or Delete](#3132-acl-table-create-or-delete) + * [3.1.3.3 Acl Rule Create or Delete](#3133-acl-rule-create-or-delete) + * [3.1.4 SAI Redis](#314-sai-redis) + * [3.1.5 SAI DB](#315-sai-db) + * [3.1.6 syncd](#316-syncd) + * [3.1.7 General updates](#317-general-updates) + * [3.2 Phase 2](#32-phase-2) + * [3.2.1 Orchestration Agent](#321-orchestration-agent) + * [3.2.1.1 Counters](#3211-counters) + * [3.2.1.2 ACL Table 
Update](#3212-acl-table-update) + * [3.2.1.3 ACL Rule Update](#3213-acl-rule-update) + * [3.2.1.4 Configuration update](#3214-configuration-update) + * [3.3 Phase 3](#33-phase-3) + * [3.3.1 Orchestration Agent](#331-orchestration-agent) + * [3.3.1.1 ACL Ranges support:](#3311-acl-ranges-support) + * [3.3.1.2 Binding ACL Table to Port](#3312-binding-acl-table-to-port) + * [3.3.1.3 ACL and LAG](#3313-acl-and-lag) + * [3.3.1.4 ACL mirroring](#3314-acl-mirroring) + * [4 Flows](#4-flows) + * [4.1 Creating of ACL Objects](#41-creating-of-acl-objects) + * [4.2 Deleting of ACL Objects](#42-deleting-of-acl-objects) + * [4.3 Updating of ACL Objects](#43-updating-of-acl-objects) + * [4.4 Creating of ACL Mirror rules](#44-creating-of-acl-mirror-rules) + * [4.5 Deleting of ACL Mirror rules](#45-deleting-of-acl-mirror-rules) + * [4.6 Mirror state change handling](#46-mirror-state-change-handling) + * [5 swssconfig input file format and restrictions](#5-swssconfig-input-file-format-and-restrictions) + * [6 Testing](#6-testing) + * [6.1 Testing environment](#61-testing-environment) + * [6.2 List of tests to cover basic functionality](#62-list-of-tests-to-cover-basic-functionality) + * [6.3 Additional tests for Phase 2/3](#63-additional-tests-for-pase-23) + * [Appendix A:Keywords for matches and actions](#appendix-akeywords-for-matches-and-actions) + * [Appendix B: Sample input json file](#appendix-b-sample-input-json-file) + * [Appendix C: Code sample](#appendix-c-code-sample) + +# List of Tables +* [Table 1: Revision](#revision) +* [Table 2: Abbreviations](#definitionsabbreviation) +* [Table 3: Implementation schedule](#23-requirements-implementation-schedule) +* [Table 4: Matches allowed in the table of the type "L3"](#3124-table-of-type-l3) +* [Table 5: Actions allowed in the table of the type "L3"](#table-5-actions-allowed-in-the-table-of-the-type-l3) +* [Table 6: Matches allowed in the table of the type "mirror"](#3125-table-of-type-mirror) +* [Table 7: Actions allowed in the 
table of the type "mirror"](#table-7-actions-allowed-in-the-table-of-the-type-mirror)
+* [Table 8: Json file keywords](#table-8-json-file-keywords)
+
+###### Revision
+| Rev | Date | Author | Change Description |
+|:---:|:-----------:|:------------------:|-------------------------------------|
+| 0.1 | | Andriy Moroz | Initial version |
+| 0.2 | 4-Nov-2016 | Andriy Moroz | Fixes after pre-DR |
+| 0.3 | 10-Nov-2016 | Andriy Moroz | Updated according to the comments |
+| 0.4 | 20-Dec-2016 | Oleksandr Ivantsiv | Update data structures |
+| 1.1 | 08-Apr-2025 | Anish Narsian | VXLAN inner src mac rewrite support |
+# About this Manual
+This document provides general information about the ACL feature implementation in SONiC.
+# Scope
+This document describes the high level design of the ACL feature.
+# Definitions/Abbreviation
+###### Table 2: Abbreviations
+| Definitions/Abbreviation | Description |
+|--------------------------|--------------------------------------------|
+| ACL | Access Control List |
+| API | Application Programmable Interface |
+| SAI | Switch Abstraction Interface |
+| ERSPAN | Encapsulated Remote Switched Port Analysis |
+| JSON | JavaScript Object Notation |
+
+# 1 Sub-system Overview
+## 1.1 System Chart
+The following diagram gives a top-level overview of the SONiC switch components:
+![](../raw/gh-pages/images/acl_hld/sonic_sub.png)
+## 1.2 Modules description
+### 1.2.1 swssconfig
+Reads prepared json files with the ACL configuration and injects them into the App DB.
+### 1.2.2 App DB
+Located in the Redis DB instance #0 running inside the container "database". Redis works with data in the form of key-value tuples, needs no predefined schema and can hold various types of data.
+### 1.2.3 Orchestration Agent
+This component runs in the "orchagent" docker container and is responsible for processing updates of the App DB and making the corresponding changes in the SAI DB via SAI Redis.
+### 1.2.4 SAI Redis
+SAI Redis is an implementation of the SAI API which translates API calls into SAI objects stored in the SAI DB. It already handles ACL data.
+### 1.2.5 SAI DB
+Redis DB instance #1. Holds serialized SAI objects.
+### 1.2.6 syncd
+Reads SAI DB data (SAI objects) and performs the appropriate calls to Switch SAI.
+### 1.2.7 SAI (Redis and Switch)
+A unified API which represents the switch state as a set of objects. In SONiC it has two implementations: the SAI DB frontend and the ASIC SDK wrapper.
+# 2 ACL Subsystem Requirements Overview
+## 2.1 Functional requirements
+*Mostly copy-paste from the provided acl.md*
+
+- Support data plane ACL in SONiC (M)
+- Support ACL table which contains a set of ACL rules (M)
+ACL table has a predefined type; each type defines the set of match fields and actions available for the table. For example, a mirror acl table only supports mirror as an action. (M)
+- Support binding ACL table to ports, initially only support for front panel physical port binding (M)
+- Support binding multiple ACL tables to ports. The use case is to have a data plane ACL table which does permit/deny while a mirror ACL table mirrors the same packet. Initially, there will be no conflicting actions between two ACL tables bound to the same set of ports. (M)
+- Support matching ip src/dst, ip protocol, tcp/udp port in ACL rules (M)
+- Support port range matching in ACL rules (M)
+- Support permit/deny action in ACL rules (M)
+- Support packet erspan mirror action in ACL rules (M)
+- Packet counters for each acl rule (M)
+- Byte counters for each acl rule (S)
+- Support rewriting the inner src mac field of a vxlan packet by matching on an Inner Src IP prefix and VXLAN VNI
+
+## 2.2 Scalability requirements
+- 1K ACL rules for L3 acl table
+- 256 ACL rules for mirror
+- 10K ACL rules for inner src mac rewrite
+
+## 2.3 Requirements implementation schedule
+###### **Table 3: Implementation schedule**
+Requirement| Implementation Phase |Comment
+-----------|----------------------|-------
+Support data plane ACL in SONiC (M) | Phase 1
+Support ACL table which contains a set of ACL rules (M) | Phase 1
+ACL table has a predefined type; each type defines the set of match fields and actions available for the table. For example, a mirror acl table only supports mirror as an action. (M) | Phase 1
+Support binding ACL table to ports, initially only support for front panel physical port binding (M) | Phase 3 | Phase 1 ?
+Support binding multiple ACL tables to ports. The use case is to have a data plane ACL table which does permit/deny while a mirror ACL table mirrors the same packet. Initially, there will be no conflicting actions between two ACL tables bound to the same set of ports. (M) | Phase 3
+Support matching ip src/dst, ip protocol, tcp/udp port in ACL rules (M) | Phase 1
+Support port range matching in ACL rules (M) | Phase 3 | Phase 1 ?
+Support permit/deny action in ACL rules (M) | Phase 1
+Support packet erspan mirror action in ACL rules (M) | Phase 1 | Phase 3
+Packet counters for each acl rule (M) | Phase 2
+Byte counters for each acl rule (S) | Phase 2
+ACL and LAG | Phase 3
+Configuration update | Phase 2
+
+# 3 Modules Design
+## 3.1 Phase 1
+Phase 1 implements the basic ACL functionality: the complete data flow (from the input json file to the ASIC) and creating/removing ACL Tables and ACL Rules. Rules will support simple matching (all except ranges) and permit/deny actions.
+### 3.1.1 swssconfig
+swssconfig is generic enough and probably needs no update to support ACL.
+Make sure it supports the ACL configuration json provided in Appendix B.
+### 3.1.2 App DB
+No update is needed to support ACL.
+#### 3.1.2.1 App DB Schema Reference
+##### 3.1.2.1.1 ACL Tables Table
+    key         = ACL_TABLE:name            ; acl_table_name must be unique
+    ;field      = value
+    policy_desc = 1*255VCHAR                ; name of the ACL policy table description
+    type        = "mirror"/"l3"             ; type of acl table; every table
+                                            ; type defines a specific set of
+                                            ; matches and actions.
+    ports       = [0-max_ports]*port_name   ; the ports to which this ACL
+                                            ; table is applied, can be empty
+    ;value annotations
+    port_name   = 1*64VCHAR                 ; name of the port, must be unique
+    max_ports   = 1*5DIGIT                  ; number of ports supported on the chip
+##### 3.1.2.1.2 ACL Rules Table
+    key: ACL_RULE_TABLE:table_name:rule_name ; key of the rule entry in the
+                                             ; table. A rule is always
+                                             ; associated with a table.
+
+    ;field = value
+    PRIORITY = 1*3DIGIT                       ; rule priority. Valid values range
+                                              ; could be platform dependent
+
+    PACKET_ACTION = "forward"/"drop"/"mirror" ; action when the fields are
+                                              ; matched (mirror action only
+                                              ; available to mirror acl table
+                                              ; type)
+
+    MIRROR_ACTION = 1*255VCHAR                ; refers to the mirror session
+                                              ; (only available to mirror acl
+                                              ; table type)
+
+    INNER_SRC_MAC_REWRITE_ACTION = 12HEXDIG   ; Rewrite the inner mac field of a VXLAN packet with the
+                                              ; provided value (must also define an associated custom ACL_TABLE_TYPE
+                                              ; as per https://github.com/sonic-net/SONiC/blob/master/doc/acl/ACL-Table-Type-HLD.md)
+
+    ETHER_TYPE = h16                          ; Ethernet type field
+
+    IP_TYPE = ip_types                        ; options of the l2_protocol_type
+                                              ; field. Only v4 is supported at
+                                              ; this stage.
+
+    IP_PROTOCOL = h8                          ; options of the l3_protocol_type field
+
+    SRC_IP = ipv4_prefix                      ; options of the source ipv4
+                                              ; address (and mask) field
+
+    DST_IP = ipv4_prefix                      ; options of the destination ipv4
+                                              ; address (and mask) field
+
+    L4_SRC_PORT = port_num                    ; source L4 port or the
+    L4_DST_PORT = port_num                    ; destination L4 port
+
+    L4_SRC_PORT_RANGE = port_num_L-port_num_H ; source ports range of the L4 ports field
+    L4_DST_PORT_RANGE = port_num_L-port_num_H ; destination ports range of the L4 ports field
+
+    TCP_FLAGS = h8/h8                         ; TCP flags field and mask
+    DSCP = h8                                 ; DSCP field (only available for mirror
+                                              ; table type)
+
+    TUNNEL_VNI = DIGITS                       ; 1 to 16 million VNI values to match on
+    INNER_SRC_IP = ipv4_prefix                ; Inner src IPv4 prefix to match on
+
+    ;value annotations
+    ip_types = any | ip | ipv4 | ipv4any | non_ipv4 | ipv6any | non_ipv6
+    port_num   = 1*5DIGIT  ; a number between 0 and 65535
+    port_num_L = 1*5DIGIT  ; a number between 0 and 65535,
+                           ; port_num_L < port_num_H
+    port_num_H = 1*5DIGIT  ; a number between 0 and 65535,
+                           ; port_num_L < port_num_H
+    ipv6_prefix = 6( h16 ":" ) ls32
+                / "::" 5( h16 ":" ) ls32
+                / [ h16 ] "::" 4( h16 ":" ) ls32
+                / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
+                / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
+                / [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
+                / [ *4( h16 ":" ) h16 ] "::" ls32
+                / [ *5( h16 ":" ) h16 ] "::" h16
+                / [ *6( h16 ":" ) h16 ] "::"
+    h8  = 1*2HEXDIG
+    h16 = 1*4HEXDIG
+    ls32 = ( h16 ":" h16 ) / IPv4address
+    ipv4_prefix = dec-octet "." dec-octet "." dec-octet "." dec-octet "/" %d1-32
+    dec-octet = DIGIT                 ; 0-9
+              / %x31-39 DIGIT         ; 10-99
+              / "1" 2DIGIT            ; 100-199
+              / "2" %x30-34 DIGIT     ; 200-249
+              / "25" %x30-35          ; 250-255
+#### 3.1.2.2 ACL Table
+ACL Tables will be added to the App DB under the key ACL_TABLE:table_id. table_id is a string which will be specified by the user and should be unique across the App DB. table_id will be used to refer to the table when adding rules and when updating or deleting the table.
+Tables will have the following properties:
+
+- policy_desc - name of the ACL policy table description
+- type - one of the two predefined table types: "L3" or "mirror"
+- ports - the list of ports bound to the table
+
+The table type also defines the list of supported matches that can be used in rules belonging to this table.
+#### 3.1.2.3 ACL Rule
+ACL Rules will be added to the App DB under the key ACL_RULE_TABLE:table_id:rule_id. table_id is the ID of the table the rule belongs to, and rule_id is a string which should be unique within the table. rule_id will be used to refer to the rule when it needs to be updated or deleted.
+Rules will have the following properties:
+
+- priority - rule priority in the table
+- match:value - packet properties this rule will match
+- action:value - action to be applied to the packet if the match was successful
+
+The list of allowed matches and actions depends on the table the rule belongs to. The complete list of supported matches and actions is provided in chapters 3.1.2.4 and 3.1.2.5.
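To make the App DB key layout above concrete, here is a minimal sketch of how a consumer could recover table_id and rule_id from a rule key. The helper name and splitting logic are illustrative only, not the actual swss-common API:

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Illustrative helper (hypothetical, not part of swss-common): split an
// App DB key such as "ACL_RULE_TABLE:table_id:rule_id" on ':' so that the
// consumer can recover the table name and the rule name.
static std::vector<std::string> splitAclKey(const std::string &key, char delim = ':')
{
    std::vector<std::string> parts;
    std::stringstream ss(key);
    std::string item;
    while (std::getline(ss, item, delim))
        parts.push_back(item);
    return parts;
}
```

With the schema above, a rule key splits into three parts (prefix, table_id, rule_id) and a table key into two, which is how the orchestration code can tell which object a change refers to.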
+#### 3.1.2.4 Table of type "L3" +###### **Table 4: Matches allowed in the table of the type "L3"** +Keyword for the match criteria | Type | Description +-------------------------------|------|------------ +ETHER_TYPE | uint16_t | Hexadecimal integer [0..FFFF] +IP_TYPE | string | One of: "IPv4"/"NON_IPv4"/"ARP" +IP_PROTOCOL | uint8_t | Hexadecimal unsigned integer [0..FF] +SRC_IP | ip_address | A valid IPv4 subnet in format IP/Mask +DST_IP | ip_address | A valid IPv4 subnet in format IP/Mask +L4_SRC_PORT | uint16_t | Decimal unsigned integer [0..65535] +L4_DST_PORT | uint16_t | Decimal unsigned integer [0..65535] +TCP_FLAGS | uint8_t | Hexadecimal unsigned integer [0..FF] +L4_SRC_PORT_RANGE | uint16_t, uint16_t | Two dash separated decimal unsigned integers [0..65535] +L4_DST_PORT_RANGE | uint16_t, uint16_t | Two dash separated decimal unsigned integers [0..65535] + +###### **Table 5: Actions allowed in the table of the type "L3"** +Keyword for the action type | Type | Description +-------------------------------|------|------------ +PACKET_ACTION | string | Packet action value: "FORWARD" or "DROP" + +#### 3.1.2.5 Table of type "Mirror" +###### **Table 6: Matches allowed in the table of the type "mirror"** +Keyword for the match criteria | Type | Description +-------------------------------|------|------------ +IP_PROTOCOL | uint8_t | IP protocol type in hexadecimal format [0..FF] +DSCP | uint8_t | Hexadecimal unsigned integer [0..FF] +SRC_IP | ip_addr/mask | A valid IPv4 subnet in format IP/Mask +DST_IP | ip_addr/mask | A valid IPv4 subnet in format IP/Mask +L4_SRC_PORT | uint16_t | Decimal unsigned integer [0..65535] +L4_DST_PORT | uint16_t | Decimal unsigned integer [0..65535] +###### **Table 7: Actions allowed in the table of the type "mirror"** +Keyword for the action type | Type | Description +-------------------------------|------|------------ +MIRROR_ACTION | string | Mirror session name + +#### 3.1.2.6 A Custom Table to perform inner src mac rewrite 
+###### **Table 8: Matches used to perform vxlan inner packet src mac rewrite**
+Keyword for the match criteria | Type | Description
+-------------------------------|------|------------
+TUNNEL_VNI | uint24_t | VXLAN VNI to match in a VXLAN packet
+INNER_SRC_IP | ip_addr/mask | A valid IPv4 subnet in format IP/Mask
+###### **Table 9: Actions used to perform vxlan inner packet src mac rewrite**
+Keyword for the action type | Type | Description
+-------------------------------|------|------------
+INNER_SRC_MAC_REWRITE_ACTION | mac_address | Mac address to use when rewriting the inner src mac field of a vxlan packet
+
+### 3.1.3 Orchestration Agent
+The Orchestration Agent needs to be updated in order to support ACL in the App DB and the SAI ACL API. A class AclOrch and a set of data structures will be implemented to handle the ACL feature.
+The Orchestration Agent will process table and rule create, delete and update operations based on App DB changes. Some object updates will be handled and some will be considered invalid.
+See Chapter 5 for the details.
+
+#### 3.1.3.1 Class AclOrch
+Class AclOrch will hold a set of methods matching the generic Orch class pattern to handle App DB updates. The class will be initialized with the list of ACL tables to subscribe to the appropriate App DB updates. The doTask() method will be called on table updates and will dispatch the DB update to the other handlers based on which table was updated.
+Below is the skeleton of the AclOrch class:
+```cpp
+    struct AclRule {
+        sai_object_id_t saiId;
+        string rule_id;
+        map<string, string> matches; // match name -> value
+        string action; // array?
+    };
+
+    struct/class AclTable {
+        sai_object_id_t saiId;
+        string table_id;
+        string description; // needed?
+        table_type_t m_type;
+        vector<AclRule> m_rules;
+    };
+
+    class AclOrch : public Orch {
+        void doTask();
+        void doAclTableTask();
+        void doAclRuleTask();
+        ...
+        vector<AclTable> m_AclTables;
+    };
+```
+This class will be responsible for:
+
+- processing updates of the ACL tables (create/delete/update)
+- partial input data (App DB) validation (including cross-table validation)
+- replicating ACL data from the App DB to the SAI DB via SAI Redis
+- caching of the ACL objects in order to detect object updates and perform state dump.
+#### 3.1.3.2 Acl Table Create or Delete
+The AclOrch class will inherit and reuse the Orch class functionality, which exploits the producer-consumer mechanism (implemented in swss-common) to track changes in the Redis database tables. ACL Tables are stored under ACL_TABLE:* keys in the App DB. On an ACL_TABLE update in the App DB, AclOrch::doAclTableTask() will be called to process the change. On table create, AclOrch will verify whether the table already exists (using table_id); creating a table which already exists will be processed as an update. A regular create or delete will update the internal class structures, and the appropriate SAI objects will be created or deleted.
+Validation: on create validate the table type.
+#### 3.1.3.3 Acl Rule Create or Delete
+ACL Rules are stored under ACL_RULE_TABLE:* keys in the App DB. On an ACL_RULE_TABLE update in the App DB, AclOrch::doAclRuleTask() will be called to process the change. On rule create, AclOrch will verify whether the rule already exists (using rule_id); creating a rule which already exists will be processed as an update. A regular create or delete will update the internal class structures, and the appropriate SAI objects will be created or deleted.
+Validation: make sure the table exists, the list of match criteria is valid and fits the table, and the list of actions is valid.
+### 3.1.4 SAI Redis
+No updates in Phase 1.
+### 3.1.5 SAI DB
+No updates in Phase 1.
+### 3.1.6 syncd
+No updates in Phase 1.
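The create-or-update decision described in 3.1.3.2 and 3.1.3.3 can be sketched as follows. This is a simplified model with hypothetical names (FakeAclOrch, handleTableOp), not the real AclOrch code, and the creation and deletion of the corresponding SAI objects is omitted:

```cpp
#include <cassert>
#include <map>
#include <string>

// Minimal stand-in for an ACL table record.
struct FakeAclTable { std::string description; };

// Sketch of the dispatch logic: a SET on an unknown table_id is a create,
// a SET on a known table_id is processed as an update, and a DEL removes
// the internal record.
class FakeAclOrch
{
public:
    // Returns "create", "update" or "delete" to show which path was taken.
    std::string handleTableOp(const std::string &op, const std::string &tableId,
                              const std::string &desc)
    {
        auto it = m_tables.find(tableId);
        if (op == "DEL")
        {
            if (it != m_tables.end()) m_tables.erase(it);
            return "delete";
        }
        // op == "SET": create if unknown, otherwise process as update
        if (it == m_tables.end())
        {
            m_tables[tableId] = FakeAclTable{desc};
            return "create";
        }
        it->second.description = desc;
        return "update";
    }

private:
    std::map<std::string, FakeAclTable> m_tables;
};
```

The real implementation makes the same existence check with table_id (or rule_id), which is why a repeated SET on the same key never creates a duplicate SAI object.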
+### 3.1.7 General updates
+Add definitions for the table names "ACL_TABLE" and "ACL_RULE_TABLE" to schema.h.
+## 3.2 Phase 2
+### 3.2.1 Orchestration Agent
+#### 3.2.1.1 Counters
+Add handling of the counter action for tables and rules. This assumes automatic counter object creation, adding the counter to each rule on create and removing it on delete.
+```c++
+    struct AclRule {
+        sai_object_id_t saiId;
+        sai_object_id_t counter_oid;
+        string rule_id;
+        map<string, string> matches;
+        string action; // array?
+    };
+```
+There will be counters registering the number of packets and the number of bytes processed by the rule.
+Counters will be stored in DB #2 at a predefined period. The update period will be hard-coded; the default value will be 10 seconds.
+The DB schema for ACL counters is the following:
+
+    COUNTERS:ACL_TABLE_NAME:ACL_RULE_NAME
+    Packets :
+    Bytes :
+
+#### 3.2.1.2 ACL Table Update
+If an update refers to a table which already exists, the change will be considered an update. This will cause updating of internal records as well as the corresponding SAI objects. Updating SAI objects may require recreating them.
+#### 3.2.1.3 ACL Rule Update
+If an update refers to a rule which already exists, the change will be considered an update. This will cause updating of internal records as well as the corresponding SAI objects.
+Validation: similar to the one performed on create.
+#### 3.2.1.4 Configuration update
+Besides the straightforward "delete-create" way of updating, we need to consider performing a "safe update", where the new configuration is created prior to removing the old one, and the switch to the new configuration happens only if it is successfully created. This will require resolving at least two issues:
+
+- need to be sure there are enough hardware resources to hold both the old and the new configurations
+- the update should be "atomic", i.e. the Orchestration Agent should receive an entire update before starting to apply it.
+
+Validation: similar to the one performed on create.
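+To make the counters schema in 3.2.1.1 concrete, the poller conceptually composes one DB #2 entry per rule. The helper below is a hypothetical sketch (the counter values are placeholders; the real implementation reads them via SAI and writes them through swss-common):
+```cpp
+#include <cstdint>
+#include <map>
+#include <string>
+
+// Builds the "COUNTERS:ACL_TABLE_NAME:ACL_RULE_NAME" key and its two fields.
+std::map<std::string, std::string> makeCounterEntry(
+    const std::string &table, const std::string &rule,
+    uint64_t packets, uint64_t bytes, std::string &key)
+{
+    key = "COUNTERS:" + table + ":" + rule;
+    return {{"Packets", std::to_string(packets)},
+            {"Bytes", std::to_string(bytes)}};
+}
+```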
+
+## 3.3 Phase 3
+Phase 3 will implement ACL Ranges support and ACL Table to port binding.
+### 3.3.1 Orchestration Agent
+#### 3.3.1.1 ACL Ranges support
+In the Orchestration Agent, in class AclOrch:
+```c++
+    struct AclRange {
+        sai_object_id_t saiId;
+        tuple<sai_acl_range_type_t, int, int> range;
+    };
+
+    struct AclCounter {
+        sai_object_id_t saiId;
+    };
+
+    struct AclRule {
+        sai_object_id_t saiId;
+        string rule_id;
+        map<string, string> matches;
+        string action; // array?
+        AclCounter byteCounter;
+        AclCounter packetCounter;
+    };
+
+    struct/class AclTable {
+        sai_object_id_t saiId;
+        string table_id;
+        string description; // needed?
+        table_type_t m_type;
+        vector<sai_object_id_t> m_ports;
+        vector<AclRule> m_rules;
+    };
+
+    class AclOrch : public Orch {
+        void doTask();
+        void doAclTableTask();
+        void doAclRuleTask();
+        ...
+        vector<AclTable> m_AclTables;
+        map<tuple<sai_acl_range_type_t, int, int>, AclRange> m_AclRanges;
+    };
+```
+Add handling, caching and validation of range matching. This also includes detecting and reusing identical ranges in order to save hardware resources.
+
+#### 3.3.1.2 Binding ACL Table to Port
+While declaring an ACL table in a json config file, it is mandatory to specify a port or the list of ports this table will be bound to. Starting from SAI v1.0, multiple tables cannot be bound to one port. To implement this feature, tables first have to be added to a group, and then the group can be bound to the port.
+Groups will be created and managed by Ports (`class Port`, implemented in `orchagent/port.cpp`). The `PortsOrch` class API will be extended with the method `getPort` to return an appropriate Port class instance. The Port class will provide a method `bindAclTable` which will handle creation of the group, binding the group to the port and adding the given ACL table to the corresponding group.
+
+Code sample which binds a table to the port:
+
+    sai_status_t AclOrch::bindAclTable(sai_object_id_t table_oid,..)
+    {
+        for (const auto& portOid : aclTable.ports)
+        {
+            Port port;
+            gPortsOrch->getPort(portOid, port);
+
+            sai_object_id_t group_member_oid;
+            status = port.bindAclTable(group_member_oid, table_oid);
+            ...
+If the LAG port has not been created yet when an ACL table is bound to it, the LAG port will be added to an internal pending port list. After the LAG port is created, AclOrch will get a notification from STATE_DB and will bind the ACL table to the LAG port. This is implemented by adding a "doAclTablePortUpdateTask" handler for the port configured notification from STATE_DB.
+
+#### 3.3.1.3 ACL and LAG
+
+- A LAG member port shall not be added to ACL Tables; otherwise this will be considered an invalid configuration and fail.
+- LAG ACL configurations will be automatically applied to all LAG members; this is done by SAI/SDK.
+
+#### 3.3.1.4 ACL mirroring
+```c++
+    class AclRule
+    {
+    public:
+        AclRule(AclOrch *aclOrch, string rule, string table);
+        virtual bool validateAddPriority(string attr_name, string attr_value);
+        virtual bool validateAddMatch(string attr_name, string attr_value);
+        virtual bool validateAddAction(string attr_name, string attr_value) = 0;
+        virtual bool validate() = 0;
+        bool processIpType(string type, sai_uint32_t &ip_type);
+
+        virtual bool create();
+        virtual bool remove();
+        virtual void update(SubjectType, void *) = 0;
+
+        string getId()
+        {
+            return id;
+        }
+
+        string getTableId()
+        {
+            return table_id;
+        }
+
+        sai_object_id_t getCounterOid()
+        {
+            return counter_oid;
+        }
+
+        static shared_ptr<AclRule> makeShared(acl_table_type_t type, AclOrch *acl, MirrorOrch *mirror, string rule, string table);
+        virtual ~AclRule() {};
+
+    protected:
+        virtual bool createCounter();
+        virtual bool removeCounter();
+
+        AclOrch *aclOrch;
+        string id;
+        string table_id;
+        sai_object_id_t table_oid;
+        sai_object_id_t rule_oid;
+        sai_object_id_t counter_oid;
+        uint32_t priority;
+        map<sai_acl_entry_attr_t, sai_attribute_value_t> matches;
+        map<sai_acl_entry_attr_t, sai_attribute_value_t> actions;
+    };
+
+    class AclRuleL3: public AclRule
+    {
+    public:
+        AclRuleL3(AclOrch *aclOrch, 
string rule, string table);
+
+        bool validateAddAction(string attr_name, string attr_value);
+        bool validate();
+        void update(SubjectType, void *);
+    };
+
+    class AclRuleMirror: public AclRule
+    {
+    public:
+        AclRuleMirror(AclOrch *aclOrch, MirrorOrch *mirrorOrch, string rule, string table);
+        bool validateAddAction(string attr_name, string attr_value);
+        bool validate();
+        bool create();
+        bool remove();
+        void update(SubjectType, void *);
+        AclRuleCounters getCounters();
+
+    protected:
+        bool m_state;
+        string sessionName;
+        acl_stage_type_t m_tableStage;
+        AclRuleCounters counters;
+        MirrorOrch *m_pMirrorOrch;
+    };
+
+    struct AclTable {
+        string id;
+        string description;
+        acl_table_type_t type;
+        ports_list_t ports;
+        // Map rule name to rule data
+        map<string, shared_ptr<AclRule>> rules;
+        AclTable(): type(ACL_TABLE_UNKNOWN) {}
+    };
+```
+To support binding the mirror action to both ingress and egress ACL rules, a member `acl_stage_type_t m_tableStage` is added
+to class AclRuleMirror to indicate the stage of the ACL mirror rule. Based on the stage, the proper mirror
+action can be selected: "SAI_ACL_ENTRY_ATTR_ACTION_MIRROR_INGRESS" for ingress ACL rules, "SAI_ACL_ENTRY_ATTR_ACTION_MIRROR_EGRESS"
+for egress ACL rules.
+Add the possibility to receive updates about mirror session state changes and change the state of the mirroring rules accordingly.
+# 4 Flows
+## 4.1 Creating of ACL Objects
+![](acl_create.png)
+## 4.2 Deleting of ACL Objects
+![](acl_delete.png)
+## 4.3 Updating of ACL Objects
+Depending on the number of changed properties in the updated ACL object, an update may include one or more extra delete/create calls to SAI Redis.
+![](acl_update.png)
+## 4.4 Creating of ACL Mirror rules
+![](acl_mirror_rule_flow.svg)
+## 4.5 Deleting of ACL Mirror rules
+![](mirror_delete.png)
+## 4.6 Mirror state change handling
+![](mirror_state_change.png)
+# 5 swssconfig input file format and restrictions
+- Valid json file. The file should be in the format swssconfig can process. 
This assumes lists surrounded by square brackets, dictionaries with curly brackets (braces), tuples inside a dictionary separated with a semicolon and enumerated elements separated with a comma.
+- Logical consistency. The configuration provided should be complete. Rules should not refer to non-existing tables, etc.
+- Order: Tables should appear before Rules.
+- The list of keywords used to address the different match criteria and actions is provided in Appendix A.
+- Rules should have at least one match criterion and one action.
+- The list of ports to bind to the table should contain physical port names.
+- Maximum number of rules allowed: 1000 rules total in all "L3" tables and 256 rules total in all "Mirror" tables.
+A json file example is in Appendix B.
+# 6 Testing
+## 6.1 Testing environment
+Ansible + PTF
+## 6.2 List of tests to cover basic functionality
+- simple permit (any)
+- simple deny (any)
+- permit/deny with matching (IP, port, ethertype, etc)
+## 6.3 Additional tests for Phase 2/3
+- permit/deny and counter
+- permit/deny with range
+- permit/deny with two ranges (src, dst)
+
+# Appendix A: Keywords for matches and actions
+###### **Table 8: Json file keywords**
+|Keyword | Description|
+|------------|------------|
+|policy_desc | ACL Table property, contains a human readable table description string
+|type | ACL Table property. Could be "L3" or "Mirror"
+|ports | ACL Table property. String with comma separated port names.
+|priority | ACL Rule property. Rule priority in the table
+| | MATCHES
+|src_ip | ACL Rule property. Source IP address
+|dst_ip | ACL Rule property. Destination IP address
+|l4_src_port | ACL Rule property. L4 source port
+|l4_dst_port | ACL Rule property. L4 destination port
+|l4_src_port_range | ACL Rule property. L4 source ports range. Valid for rules in "L3" tables only
+|l4_dst_port_range | ACL Rule property. L4 destination ports range. Valid for rules in "L3" tables only
+|ether_type | ACL Rule property. 
Ethernet type
+|ip_protocol | ACL Rule property. IP protocol
+|tcp_flags | ACL Rule property. TCP flags
+|ip_type | ACL Rule property. IP type
+|dscp | ACL Rule property. DSCP field. Valid for rules in "mirror" tables only
+|inner_src_ip | ACL Rule property. Inner src IP prefix to match on a VXLAN packet.
+|tunnel_vni | ACL Rule property. VXLAN VNI field to match on.
+| | ACTIONS
+|packet_action | ACL Rule property. Packet actions "forward" or "drop". Valid for rules in "L3" tables only
+|mirror_action | Action "mirror". Valid for rules in "mirror" tables only
+|inner_src_mac_rewrite_action | Action to rewrite the inner src mac field.
+*Keywords derived from the SAI ACL attributes.*
+# Appendix B: Sample input json file
+```
+    [
+        {
+            "ACL_TABLE:0d41db739a2cc107": {
+                "policy_desc" : "Permit some traffic, for the customer #4",
+                "type" : "L3",
+                "ports" : [
+                    "port1",
+                    "port2",
+                    "port3"
+                ] # physical port names
+            },
+            "OP": "SET"
+        },
+        {
+            "ACL_RULE_TABLE:0d41db739a2cc107:3f8a10ff": {
+                "priority" : "55",
+                "IP_PROTOCOL" : "TCP",
+                "SRC_IP" : "20.0.0.0/25",
+                "DST_IP" : "20.0.0.0/23",
+                "L4_SRC_PORT_RANGE" : "1024-65535",
+                "L4_DST_PORT_RANGE" : "80-89",
+                "PACKET_ACTION" : "FORWARD"
+            },
+            "OP": "SET"
+        }
+    ]
+```
+# Appendix C: Code sample
+Below is pseudo-code in C which shows how the configuration described in Appendix B will be applied using the SAI API.
+```c++
+    // SAI API query... 
+ sai_acl_api_t *acl_api; + sai_port_api_t *port_api; + + // Create table + sai_attribute_t table_attrs[] = + { + {.id = SAI_ACL_TABLE_ATTR_STAGE, + .value.s32 = SAI_ACL_STAGE_INGRESS}, + {.id = SAI_ACL_TABLE_ATTR_PRIORITY, + .value.u32 = 10}, + {.id = SAI_ACL_TABLE_ATTR_SIZE, + .value.u32 = 0}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_ETHER_TYPE, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_IP_TYPE, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_IP_PROTOCOL, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_SRC_IP, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_DST_IP, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_L4_SRC_PORT, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_L4_DST_PORT, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_TCP_FLAGS, + .value.booldata = true}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_RANGE, + .value.s32 = SAI_ACL_RANGE_L4_SRC_PORT_RANGE}, + {.id = SAI_ACL_TABLE_ATTR_FIELD_RANGE, + .value.s32 = SAI_ACL_RANGE_L4_DST_PORT_RANGE} + }; + + size_t attrs_num = sizeof(table_attrs)/sizeof(table_attrs[0]); + + sai_status_t status; + sai_object_id_t acl_table; + + status = acl_api->create_acl_table(&acl_table, attrs_num, table_attrs); + + // Create ranges + sai_object_id_t acl_ranges[2]; + + sai_attribute_t range_attrs[] = + { + {.id = SAI_ACL_RANGE_ATTR_TYPE, + .value.s32 = SAI_ACL_RANGE_L4_SRC_PORT_RANGE}, + {.id = SAI_ACL_RANGE_ATTR_LIMIT, + .value.u32range = (sai_u32_range_t) {.min = 1024, .max = 65535}} + }; + + attrs_num = sizeof(range_attrs)/sizeof(range_attrs[0]); + status = acl_api->create_acl_range(&acl_ranges[0],attrs_num,range_attrs); + status = acl_api->create_acl_range(&acl_ranges[1],...); + + + // Create Entry (rule) + sai_object_id_t entry; + + sai_attribute_t entry_attrs[] = { + {.id = SAI_ACL_ENTRY_ATTR_TABLE_ID, + .value.oid = acl_table}, + {.id = SAI_ACL_ENTRY_ATTR_PRIORITY, + .value.u32 = 55}, + {.id = SAI_ACL_ENTRY_ATTR_ADMIN_STATE, + 
.value.booldata = true},
+        {.id = SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP,
+         .value.aclfield.data.ip4 = 0x14000000,
+         .value.aclfield.mask.ip4 = 0xFFFFFF80
+        },
+        {.id = SAI_ACL_ENTRY_ATTR_FIELD_DST_IP,
+         .value.aclfield.data.ip4 = 0x14000000,
+         .value.aclfield.mask.ip4 = 0xFFFFFE00
+        },
+        {.id = SAI_ACL_ENTRY_ATTR_FIELD_RANGE,
+         .value.aclfield.data.objlist.list = acl_ranges,
+         .value.aclfield.data.objlist.count = 2},
+        {.id = SAI_ACL_ENTRY_ATTR_PACKET_ACTION,
+         .value.aclaction.enable = true,
+         .value.aclaction.parameter.s32 = SAI_PACKET_ACTION_FORWARD}
+    };
+
+    attrs_num = sizeof(entry_attrs)/sizeof(entry_attrs[0]);
+    status = acl_api->create_acl_entry(&entry, attrs_num, entry_attrs);
+
+
+    // Bind ACL table to port
+    sai_attribute_t port_attr =
+    {
+        .id = SAI_PORT_ATTR_INGRESS_ACL_LIST,
+        .value.objlist.list = &acl_table,
+        .value.objlist.count = 1
+    };
+
+    status = port_api->set_port_attribute(port_object_id, &port_attr);
+```
diff --git a/doc/acl/ACL-Ingress-Egress-test-plan.md b/doc/acl/ACL-Ingress-Egress-test-plan.md
index 903679ad28e..5b2921538fd 100644
--- a/doc/acl/ACL-Ingress-Egress-test-plan.md
+++ b/doc/acl/ACL-Ingress-Egress-test-plan.md
@@ -71,7 +71,7 @@ When t1 topology is used, the tables can bind to all ports. When t1-lag and t1-6
A same set of improved ACL rules can be used for both ingress and egress ACL testing. While testing ingress ACL, it is always possible to hit the rules. While testing egress ACL, destination IP address of the injected packet must be routable. Otherwise, the injected packet would never get a chance to hit the egress rule.
-For completness, both packet flow directions will be covered:
+For completeness, both packet flow directions will be covered:
* TOR ports -> SPINE ports: Inject packet into tor ports. Set destination IP address to BGP routes learnt on spine ports. Check the packet on spine ports.
* SPINE ports -> TOR ports: Inject packet into spine ports. Set destination IP address to BGP routes learnt on tor ports. 
Check the packet on tor ports. @@ -81,7 +81,7 @@ Work need to be done based on this strategy and existing scripts: * Update the existing acltb.yml script: * Backup config_db. * Create ACL tables and load ACL rules for testing. - * Run the PTF scritp. + * Run the PTF script. * Restore configuration after testing. * Update the PTF script * Add more test cases for the improved set of ACL rules. @@ -91,7 +91,7 @@ Work need to be done based on this strategy and existing scripts: * Improve the existing ACL rules to address issue that RULE_12 and RULE_13 are not hit. * Extend the existing ACL rules to cover more DROP action. The PTF script should be extended accordingly too. * Change source IP to addresses that are not used by other devices in current topologies - * Add two rules to always allow BGP packets. Othewise, BGP routes will be lost. + * Add two rules to always allow BGP packets. Otherwise, BGP routes will be lost. * Add a new ansible module for gathering ACL counters in DUT switch. * Check counters of ACL rules after each PTF script execution. @@ -107,7 +107,7 @@ The ACL rules will be improved too: * Add a new set of rules RULE_14 to RULE_26 for testing DROP action. * RULE_12 and RULE_13 should use source IP address different with RULE_1, for example 20.0.0.4/32. Otherwise packets with source IP 20.0.0.2/32 would always match RULE_1 and never hit RULE_12 and RULE_13. The PTF script testing case 10 and 11 need to use this new source IP address for the injected packets. * RULE_25 and RULE_26 should use source IP address different with: RULE_1, RULE_12, RULE_13 and RULE_14. Otherwise, RULE_25 and RULE_26 will never be hit. -* RULE_27 and RULE_28 are added to always alow BGP traffic. Otherwise, BGP traffic would be blocked by the DEFAULT_RULE. +* RULE_27 and RULE_28 are added to always allow BGP traffic. Otherwise, BGP traffic would be blocked by the DEFAULT_RULE. The ACL rules should not be all loaded at the same time. 
diff --git a/doc/acl/ACL-Table-Type-HLD.md b/doc/acl/ACL-Table-Type-HLD.md index a4153a4cb64..6dad3ed9527 100644 --- a/doc/acl/ACL-Table-Type-HLD.md +++ b/doc/acl/ACL-Table-Type-HLD.md @@ -88,7 +88,7 @@ ACL table create-only SAI attributes include a list of match fields, bind point to pass on table creation, which is defined by SAI_SWITCH_ATTR_ACL_STAGE_INGRESS, SAI_SWITCH_ATTR_ACL_STAGE_EGRESS in sai_acl_capability_t structure, field is_action_list_mandatory. ```abnf -key: ACL_TABLE_TYPE:name ; key of the ACL table type entry. The name is arbitary name user chooses. +key: ACL_TABLE_TYPE:name ; key of the ACL table type entry. The name is arbitrary name user chooses. ; field = value matches = match-list ; list of matches for this table, matches are same as in ACL_RULE table. actions = action-list ; list of actions for this table, actions are same as in ACL_RULE table. @@ -301,7 +301,7 @@ other orchs to override the behavior.

The polymorphism of existing AclRule derivatives AclRuleL3, AclRuleL3V6, etc is in the validation methods. -E.g. AclRuleL3's validateAddMatch() method checks wether the match is one of the L3 table type matches. +E.g. AclRuleL3's validateAddMatch() method checks whether the match is one of the L3 table type matches. This can be consolidated into single generic validate() method that validations ACL Rule configuration by checking AclRule SAI matches and SAI actions against AclTable SAI attributes. @@ -420,7 +420,7 @@ tables referencing it. ### System tests - Existing ACL/Everflow tests cover default table types coming from init_cfg.json, which means it is covering the flow of creating table types. -- Extend existing ACL/Everflow tests with a fixture to create custom table types that will be a copy of a default onces and run the same test cases. +- Extend existing ACL/Everflow tests with a fixture to create custom table types that will be a copy of a default ones and run the same test cases. - Warm/Fast reboot tests to verify the functionality with new changes. ### Open questions @@ -430,4 +430,4 @@ tables referencing it. - Does YANG infrastructure in SONiC supports validation against STATE DB information (e.g. is_action_list_mandatory)? - Does this feature needs a similar capability table for match fields? What is the SAI API to query it? - Currently SAI object API allows to query for an ACL table attributes CREATE, SET, GET operations implementation availability (sai_query_attribute_capability, sai_attr_capability_t), - but does not tell wether it is supported or not. Can we assume if it is not implemented it is not supported? + but does not tell whether it is supported or not. Can we assume if it is not implemented it is not supported? 
diff --git a/doc/acl/acl.md b/doc/acl/acl.md index 3a962436fb9..8d818e6bf3e 100644 --- a/doc/acl/acl.md +++ b/doc/acl/acl.md @@ -48,7 +48,7 @@ Define rules associated with a specific ACL Policy key: ACL_RULE_TABLE:table_name:seq ; key of the rule entry in the table, seq is the order of the rules ; when the packet is filtered by the ACL "policy_name". - ; A rule is always assocaited with a policy. + ; A rule is always associated with a policy. ;field = value action = "permit"/"deny" ; action when the fields are matched (mirror action only available to mirror acl table type) @@ -57,12 +57,12 @@ Define rules associated with a specific ACL Policy l3_prot_type = "icmp"/"tcp"/"udp"/"any" ; options of the l3_protocol_type field ipv4_src = ipv4_prefix/"any" ; options of the source ipv4 address (and mask) field ipv4_dst = ipv4_prefix/"any" ; options of the destination ipv4 address (and mask) field - ; l2_prot_type detemines which set of the addresses taking effect, v4 or v6. + ; l2_prot_type determines which set of the addresses taking effect, v4 or v6. l4_src_port = port_num/[port_num_L-port_num_H] ; source L4 port or the range of L4 ports field l4_dst_port = port_num/[port_num_L-port_num_H] ; destination L4 port or the range of L4 ports field ;value annotations - seq = DIGITS ; unique sequence number of the rules assocaited within this ACL policy. + seq = DIGITS ; unique sequence number of the rules associated within this ACL policy. ; When applying this ACL policy, the seq determines the order of the ; rules applied. 
port_num = 1*5DIGIT ; a number between 0 and 65535 diff --git a/doc/acl/acl_stage_capability.md b/doc/acl/acl_stage_capability.md index 45494143d06..6f90fe6920b 100644 --- a/doc/acl/acl_stage_capability.md +++ b/doc/acl/acl_stage_capability.md @@ -18,7 +18,7 @@ E.g.: Egress mirror action on ingress stage or vice versa might be not supported SAI API has two mirror action types - SAI_ACL_ACTION_TYPE_MIRROR_INGRESS, SAI_ACL_ACTION_TYPE_MIRROR_EGRESS which can be set on ingress or egress table. So SONiC will not restrict setting egress mirror rule on ingress table or vice versa. -To check wheter such combination is supported by the ASIC application should look into SWITCH_CAPABILITY table which is described in part 2 of this document. +To check whether such combination is supported by the ASIC application should look into SWITCH_CAPABILITY table which is described in part 2 of this document. The proposed new schema: @@ -150,7 +150,7 @@ AclRuleMirror::validateAddAction(string attr_name, string attr_value) Test case 1: -VS test cases update to check for differnt combinations ingress/egress table and ingress/egress mirror rule creation +VS test cases update to check for different combinations ingress/egress table and ingress/egress mirror rule creation ### system level testing @@ -219,7 +219,7 @@ Check negative flow in case action is not supported. 
For ingress and egress tables: - Set custom SAI_SWITCH_ATTR_ACL_STAGE_$STAGE attribute using setReadOnlyAttribute mechanism in VS test infrastructure and restart orchagent to make it reconstruct its capability map; - - Create ACL rule wich is not supported and verify no entry in ASIC DB; + - Create ACL rule which is not supported and verify no entry in ASIC DB; ### system level testing diff --git a/doc/acl/egress_outer_dscp_change_table.md b/doc/acl/egress_outer_dscp_change_table.md index cf775183e81..36472979816 100644 --- a/doc/acl/egress_outer_dscp_change_table.md +++ b/doc/acl/egress_outer_dscp_change_table.md @@ -74,7 +74,7 @@ The recommended design approach would translate the new defined tables into thre ### Requirements -1. Support DSCP change of the outer header for an encapsulated IPv4/IPv6 packets. The orignal packet may be IPv4/IPv6 as well, and the DSCP change for both should be supported. +1. Support DSCP change of the outer header for an encapsulated IPv4/IPv6 packets. The original packet may be IPv4/IPv6 as well, and the DSCP change for both should be supported. 2. Match Criteria supported should be the same as standard L3 IPv4/IPv6 match criteria. 3. Platform capabilities should be utilized to determine if these tables can be created and the supported range of metadata values. 4. The solution should be extendable in a way that more action types can be added later. 
@@ -164,7 +164,7 @@ This approach is similar to Option A with the exception that instead of creating * | Match Type | Actions | Bind | * | | | points | * |----------------------------------------------------| - * | No additionnal Match | SET_METADATA*| | + * | No additional Match | SET_METADATA*| | * |----------------------------------------------------| * L3V6 Table augmentation @@ -173,7 +173,7 @@ This approach is similar to Option A with the exception that instead of creating * | Match Type | Actions | Bind | * | | | points | * |----------------------------------------------------| - * | No additionnal Match | SET_METADATA*| | + * | No additional Match | SET_METADATA*| | * |----------------------------------------------------| * EGR_SET_DSCP @@ -286,8 +286,8 @@ In addition, by keeping track of DSCP and metadata values, the entries in the EG ##### Metadata Management Since not all platforms support metadata matching and action, the implementation would use SAI capability check to see if metadata action and match attributes are supported. In addition, the range of the metadata field would be checked by 'SAI_SWITCH_ATTR_ACL_USER_META_DATA_RANGE'. -The orcahagent would check if a DSCP action value has an assigned metadata value, thereby reusing the same value and EGR_SET_DSCP entry. If a DSCP value has no associated metadata value, a new one would be allcoated along with EGR_SET_DSCP entry. -Since these metadata values would be reused accross differnt ACLs and for both V4 and V6 type entries, Each metadata, DSCP combition would be reference counted. +The orcahagent would check if a DSCP action value has an assigned metadata value, thereby reusing the same value and EGR_SET_DSCP entry. If a DSCP value has no associated metadata value, a new one would be allocated along with EGR_SET_DSCP entry. +Since these metadata values would be reused across different ACLs and for both V4 and V6 type entries, Each metadata, DSCP combition would be reference counted. 
During the processing of an ACL, if metadata allocation fails for an entry, the orchagent would return failure for that and subsequent entries and the user would have to remove the complete ACL. @@ -296,7 +296,7 @@ During the processing of an ACL, if metadata allocation fails for an entry, the The EGR_SET_DSCP table would be created and managed by the orchagent internally. Only a single instance of this table would be created upon creation of UNDERLAY_SET_DSCP/V6 table and subsequent tables of type UNDERLAY_SET_DSCP/V6 would reuse this instance. By employing ref-counting this table would be retained until the last user of this table is removed. Entries in this table would also be ref-counted as mnetioned in above section. The interface association of this table can be implemented in two ways. - 1) EGR_SET_DSCP association with all dataplane interfaces. This approach is resource intensive and would require changes in portsOrch to accomodate exposing all ports to the AclOrch. However this allows us to decouple the ingress stage table port association from EGR_SET_DSCP. + 1) EGR_SET_DSCP association with all dataplane interfaces. This approach is resource intensive and would require changes in portsOrch to accommodate exposing all ports to the AclOrch. However this allows us to decouple the ingress stage table port association from EGR_SET_DSCP. 2) EGR_SET_DSCP be associated with a superset of all interfaces which are associated with tables referencing it. e.g. if UNDERLAY_SET_DSCP is associated with interfaces [a,b,c] and UNDERLAY_SET_DSCPV6 is associated with [c,d,e] then EGR_SET_DSCP would be associated with [a,b,c,d,e]. This gives the user of the feature the choice to limit port association when needed. Based on common use case pattern, the tables would almost always be associated with all data-plane interfaces. @@ -474,7 +474,7 @@ There is no impact on warmboot or fastboot. - Verify ACL Capability in STATE_DB on non-supported platforms. 
- The new field `supported_UnderlaySetDSCPV6` must be false. - Verify ACL tables creation and deletion - - The UNDERLAY_SET_DSCP and UNDERLAY_SET_DSCPV6 tables can be created and deleted togther and separately. + - The UNDERLAY_SET_DSCP and UNDERLAY_SET_DSCPV6 tables can be created and deleted together and separately. - Verify that both UNDERLAY_SET_DSCP/V6 can be bound to same ports as well as separate ports. - The EGR_SET_DSCP table is only created once and is bound to the superset of the ports for both UNDERLAY_SET_DSCP/V6 and removed when all referencing tables are removed. - Verify that for multiple tables, different order of creation and removal should not have adverse affect. @@ -483,9 +483,9 @@ There is no impact on warmboot or fastboot. - If multiple rules share the same SET_DSCP action value only one entry should be created. Also EGR_SET_DSP rule should be created with first entry and removed when last rule is removed. - Verify a single EGR_SET_DSP entry is created per a unique DSCP action value accorss different tables. - If multiple rules share the same SET_DSCP action value only one entry should be created. Also EGR_SET_DSP rule should be created with first entry and removed when last rule is removed. -- Verfiy metadata allocation and deallocation +- Verify metadata allocation and deallocation - freed metadata values should be available for next allocation. -- Verfiy metadata allcoation failure. +- Verify metadata allocation failure. - when metadata is exhausted, this should result in failure. @@ -495,13 +495,13 @@ There is no impact on warmboot or fastboot. * UNDERLAY_SET_DSCP/V6 creation failure test. - Create multiple tables so that tcam is full and ensure that UNDERLAY_SET_DSCP/V6 creation fails gracefully. * UNDERLAY_SET_DSCP/V6 traffic tests with tunneled packets. - - Verify outer DSCP value change after packets are encapsualted. + - Verify outer DSCP value change after packets are encapsulated. 
* UNDERLAY_SET_DSCP/V6 traffic tests with non-tunneled packets. - Verify DSCP value change even when packets are not encapsulated. * Metadata exhaustion with traffic test. - - Verfiy metadata exhaustion results in graceful acl failure. + - Verify metadata exhaustion results in graceful acl failure. * UNDERLAY_SET_DSCP/V6 multi-creation test with traffic. - - Verfiy metadata value leak doesnt happen by creating and removing UNDERLAY_SET_DSCP/V6 entries multiple times in differnet order. Ensure proper function in each iteration with traffic. + - Verify metadata value leak doesnt happen by creating and removing UNDERLAY_SET_DSCP/V6 entries multiple times in different order. Ensure proper function in each iteration with traffic. ### Addendum ### Changes proposed by the Community to be introduced in the next iteration diff --git a/doc/asic_thermal_monitoring_hld.md b/doc/asic_thermal_monitoring_hld.md index 57a4af17f12..333bff16f6c 100644 --- a/doc/asic_thermal_monitoring_hld.md +++ b/doc/asic_thermal_monitoring_hld.md @@ -1,135 +1,135 @@ -# ASIC thermal monitoring High Level Design -### Rev 0.2 -## Table of Contents - -## 1. Revision -Rev | Rev Date | Author | Change Description ----------|--------------|-----------|------------------- -|v0.1 |01/10/2019 |Padmanabhan Narayanan | Initial version -|v0.2 |10/07/2020 |Padmanabhan Narayanan | Update based on review comments and addess Multi ASIC scenario. -|v0.3 |10/15/2020 |Padmanabhan Narayanan | Update Section 6.3 to indicate no change in thermalctld or Platform API definitions. - -## 2. Scope -ASICs typically have multiple internal thermal sensors. This document describes the high level design of a poller for ASIC thermal sensors. It details how the poller may be configured and the export of thermal values from SAI to the state DB. - -## 3. Definitions/Abbreviations - -Definitions/Abbreviation|Description -------------------------|----------- -ASIC|Application Specific Integrated Circuit. 
In SONiC context ASIC refers to the NPU/MAC. -PCIe|Peripheral Component Interconnect express -SAI| Switch Abstraction Interface -SDK| Software Development Kit -NOS| Network Operating System - -## 4. Overview - -Networking switch platforms are populated with a number of thermal sensors which include exteral (i.e. onboard) as well as internal (those located within the component, e.g. CPU, ASIC, DIMM, Transceiver etc..) sensors. Readings from both the external as well as the internal sensors are essential inputs to the thermal/fan control algorithm so as to maintain optimal cooling. While drivers exist to retrive sensor values from onboard and other internal sensors, the ASIC based sensor values are currently retrieved thru the ASIC's SDK. SAI provides the following attributes to retrive the ASIC internal sensors: - -|Attribute|Description| -|---|------| -|SAI_SWITCH_ATTR_MAX_NUMBER_OF_TEMP_SENSORS| Maximum number of temperature sensors available in the ASIC | -|SAI_SWITCH_ATTR_TEMP_LIST|List of temperature readings from all sensors| -|SAI_SWITCH_ATTR_AVERAGE_TEMP|The average of temperature readings over all sensors in the switch| -|SAI_SWITCH_ATTR_MAX_TEMP|The current value of the maximum temperature retrieved from the switch sensors| - -A configurable ASIC sensors poller is introduced that periodically retrieves the ASIC internal sensor values via SAI APIs and populates the state DB. These values may be used by the thermal control functions (via the ThermalBase() platform APIs), SNMP/CLI or other Telemetry purposes. - -## 5. Requirements - -### 5.1 Functional Requirements - -1. The ASIC sensors poller should be configurable using CONFIG DB (for each ASIC in a multi ASIC platform): - * There should be a way to enable/disable the poller - * The polling interval should be configurable (from 5 to 300 secs) -2. The retrieved values should be written to the STATE DB (of each ASIC's DB instance in a multi ASIC platform). -3. 
The ASIC internal sensor values retrieved should be useable by the Thermal Control infrastructure (https://github.com/sonic-net/SONiC/blob/master/thermal-control-design.md). - -### 5.2 CLI requirements - -"show platform temperature" should additionally display the ASIC internal sensors as well. - -### 5.3 Platform API requirements - -It should be possible to query the ASIC internal sensors using the ThermalBase() APIs - -## 6. Module Design - -### 6.1 DB and Schema changes - -A new ASIC_SENSORS ConfigDB table entry would be added to each ASIC's database instance: -``` -; Defines schema for ASIC sensors configuration attributes -key = ASIC_SENSORS|ASIC_SENSORS_POLLER_STATUS ; Poller admin status -; field = value -admin_status = "enable"/"disable" - -key = ASIC_SENSORS|ASIC_SENSORS_POLLER_INTERVAL ; Poller interval in seconds -; field = value -interval = 1*3DIGIT -``` - -IN each ASIC's stateDB instance, a new ASIC_TEMPERATORE_INFO table will be added to hold the ASIC internal temperatures: - -``` -; Defines thermal information for an ASIC -key = ASIC_TEMPERATURE_INFO -; field = value -average_temperature = FLOAT ; current average temperature value -maximum_temperature = FLOAT ; maximum temperature value -temperature_0 = FLOAT ; ASIC internal sensor 0 temperature value -... -temperature_N = FLOAT ; ASIC internal sensor N temperature value -``` - -### 6.2 SwitchOrch changes - -Apart from APP_SWITCH_TABLE_NAME, SwitchOrch will also be a consumer of CFG_ASIC_SENSORS_TABLE_NAME ("ASIC_SENSORS") to process changes to the poller configuration. A new SelectableTimer (sensorsPollerTimer) is introduced with a default of 10 seconds. - -#### 6.2.1 Poller Configuration - -* If the admin_status is enabled, the sensorsPollerTimer is started. If the poller is disabled, a flag is set so that upon the next timer callback, the timer is stopped. 
-* If there is any change in the polling interval, the sensorsPollerTimer is updated so that the new interval with take effect with the next timer callback. - -#### 6.2.2 sensorsPollerTimer - -In the timer callback, the following actions are performed: - -* Handle change to timer disable : if the user disables the timer, timer is stopped. -* Handle change to the polling interval : reset the timer if the polling interval has changed -* Get SAI_SWITCH_ATTR_TEMP_LIST and update the ASIC_TEMPERATURE_INFO in the stateDB. -* If the ASIC SAI supports SAI_SWITCH_ATTR_AVERAGE_TEMP, query and update the average temperature field in the ASIC_TEMPERATURE_INFO table in the stateDB. -* If the ASIC SAI supports SAI_SWITCH_ATTR_MAX_TEMP, query and update the maximum_temperature field in the ASIC_TEMPERATURE_INFO table in the stateDB. - -### 6.3 Platform changes to support ASIC Thermals - -Platform owners typically provide the implementation for Thermals (https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/thermal_base.py). While there is no change in existing Platform API definitions, apart from external/CPU sensors, platform vendors should also include ASIC internal sensors in the _thermal_list[] of the Chassis / Module implementations. - -Assuming a Multi ASIC Chassis with 3 ASICs, the thermal names could be: -ASIC0 Internal 0, ... ASIC0 Internal N0, ASIC1 Internal 0, ... ASIC1 Internal N1, ASIC2 Internal 0, ... ASIC2 Internal N2 -where ASIC0, ASIC1 and ASIC2 have N0, N1 and N2 internal sensors respectively. - -The implementation of the APIs get_high_threshold(), get_low_threshold(), get_high_critical_threshold(), get_name(), get_presence() etc.. are platform (ASIC) specific. The get_temperature() should retrieve the temperature from the ASIC_TEMPERATURE_INFO table of the stateDB from the concerned ASIC's DB instance (which is populated by the SwitchOrch poller as described [above](#62-switchorch-changes)). 
- -The thermalctld's TemperatureUpdater::_refresh_temperature_status() retreives the temperatures of the ASIC internal sensors from the get_temperature() API - just as it would for any external sensor. Only that in the case of ASIC internal sensors, the get_temperature() API is going to retrieve and return the value from from ASIC_TEMPERATURE_INFO table. The thermalctld also updates these values to the TEMPERATURE_INFO table in the globalDB's stateDB. Thus, there is no change in the existing thermalctld infrastructure. - -## 7 Virtual Switch - -NA - -## 8 Restrictions - -1. Unlike external sensors, ASIC's internal sensors are retrievable only thru the SDK/SAI. The proposed design eliminates the need for pmon from having to make SAI calls. Considering that thermalctld's default UPDATE_INTERVAL is 60 seconds, the ASIC_SENSORS_POLLER_INTERVAL should ideally be set to an appropriate lower value for better convergence. -2. A CLI is not currently defined for the poller configuration (for setting/getting the Poller admin state and interval configuration). 
- -## 9 Unit Test cases -Unit test case one-liners are given below: - -| # | Test case synopsis | Expected results | -|-------|----------------------|------------------| -|1| Set "ASIC_SENSORS\|ASIC_SENSORS_POLLER_STATUS" "admin_status" to "enable" for a specific ASIC instance | Check that ASIC internal sensors are dumped periodically in the ASIC_TEMPERATURE_INFO of the ASIC's stateDB instance and to the globalDB's TEMPERATURE_INFO table -|2| Set "ASIC_SENSORS\|ASIC_SENSORS_POLLER_STATUS" "admin_status" to "disable" for a specific ASIC instance | Check that the poller stops -|3| Set "ASIC_SENSORS\|ASIC_SENSORS_POLLER_INTERVAL" "interval" to "30" for a specific ASIC instance | Check that the poller interval changes from the default 10 seconds -|4| Issue "show platform temperature" | Check that the ASIC interal temperatures are displayed for all the ASICs - -## 10 Action items +# ASIC thermal monitoring High Level Design +### Rev 0.2 +## Table of Contents + +## 1. Revision +Rev | Rev Date | Author | Change Description +---------|--------------|-----------|------------------- +|v0.1 |01/10/2019 |Padmanabhan Narayanan | Initial version +|v0.2 |10/07/2020 |Padmanabhan Narayanan | Update based on review comments and address Multi ASIC scenario. +|v0.3 |10/15/2020 |Padmanabhan Narayanan | Update Section 6.3 to indicate no change in thermalctld or Platform API definitions. + +## 2. Scope +ASICs typically have multiple internal thermal sensors. This document describes the high level design of a poller for ASIC thermal sensors. It details how the poller may be configured and the export of thermal values from SAI to the state DB. + +## 3. Definitions/Abbreviations + +Definitions/Abbreviation|Description +------------------------|----------- +ASIC|Application Specific Integrated Circuit. In SONiC context ASIC refers to the NPU/MAC. 
+PCIe|Peripheral Component Interconnect express +SAI| Switch Abstraction Interface +SDK| Software Development Kit +NOS| Network Operating System + +## 4. Overview + +Networking switch platforms are populated with a number of thermal sensors which include external (i.e. onboard) as well as internal (those located within the component, e.g. CPU, ASIC, DIMM, Transceiver etc..) sensors. Readings from both the external as well as the internal sensors are essential inputs to the thermal/fan control algorithm so as to maintain optimal cooling. While drivers exist to retrieve sensor values from onboard and other internal sensors, the ASIC based sensor values are currently retrieved thru the ASIC's SDK. SAI provides the following attributes to retrieve the ASIC internal sensors: + +|Attribute|Description| +|---|------| +|SAI_SWITCH_ATTR_MAX_NUMBER_OF_TEMP_SENSORS| Maximum number of temperature sensors available in the ASIC | +|SAI_SWITCH_ATTR_TEMP_LIST|List of temperature readings from all sensors| +|SAI_SWITCH_ATTR_AVERAGE_TEMP|The average of temperature readings over all sensors in the switch| +|SAI_SWITCH_ATTR_MAX_TEMP|The current value of the maximum temperature retrieved from the switch sensors| + +A configurable ASIC sensors poller is introduced that periodically retrieves the ASIC internal sensor values via SAI APIs and populates the state DB. These values may be used by the thermal control functions (via the ThermalBase() platform APIs), SNMP/CLI or other Telemetry purposes. + +## 5. Requirements + +### 5.1 Functional Requirements + +1. The ASIC sensors poller should be configurable using CONFIG DB (for each ASIC in a multi ASIC platform): + * There should be a way to enable/disable the poller + * The polling interval should be configurable (from 5 to 300 secs) +2. The retrieved values should be written to the STATE DB (of each ASIC's DB instance in a multi ASIC platform). +3. 
The ASIC internal sensor values retrieved should be usable by the Thermal Control infrastructure (https://github.com/sonic-net/SONiC/blob/master/thermal-control-design.md).
+
+### 5.2 CLI requirements
+
+"show platform temperature" should additionally display the ASIC internal sensors as well.
+
+### 5.3 Platform API requirements
+
+It should be possible to query the ASIC internal sensors using the ThermalBase() APIs
+
+## 6. Module Design
+
+### 6.1 DB and Schema changes
+
+A new ASIC_SENSORS ConfigDB table entry would be added to each ASIC's database instance:
+```
+; Defines schema for ASIC sensors configuration attributes
+key = ASIC_SENSORS|ASIC_SENSORS_POLLER_STATUS ; Poller admin status
+; field = value
+admin_status = "enable"/"disable"
+
+key = ASIC_SENSORS|ASIC_SENSORS_POLLER_INTERVAL ; Poller interval in seconds
+; field = value
+interval = 1*3DIGIT
+```
+
+In each ASIC's stateDB instance, a new ASIC_TEMPERATURE_INFO table will be added to hold the ASIC internal temperatures:
+
+```
+; Defines thermal information for an ASIC
+key = ASIC_TEMPERATURE_INFO
+; field = value
+average_temperature = FLOAT ; current average temperature value
+maximum_temperature = FLOAT ; maximum temperature value
+temperature_0 = FLOAT ; ASIC internal sensor 0 temperature value
+...
+temperature_N = FLOAT ; ASIC internal sensor N temperature value
+```
+
+### 6.2 SwitchOrch changes
+
+Apart from APP_SWITCH_TABLE_NAME, SwitchOrch will also be a consumer of CFG_ASIC_SENSORS_TABLE_NAME ("ASIC_SENSORS") to process changes to the poller configuration. A new SelectableTimer (sensorsPollerTimer) is introduced with a default of 10 seconds.
+
+#### 6.2.1 Poller Configuration
+
+* If the admin_status is enabled, the sensorsPollerTimer is started. If the poller is disabled, a flag is set so that upon the next timer callback, the timer is stopped. 
+* If there is any change in the polling interval, the sensorsPollerTimer is updated so that the new interval will take effect with the next timer callback.
+
+#### 6.2.2 sensorsPollerTimer
+
+In the timer callback, the following actions are performed:
+
+* Handle change to timer disable : if the user disables the timer, timer is stopped.
+* Handle change to the polling interval : reset the timer if the polling interval has changed
+* Get SAI_SWITCH_ATTR_TEMP_LIST and update the ASIC_TEMPERATURE_INFO in the stateDB.
+* If the ASIC SAI supports SAI_SWITCH_ATTR_AVERAGE_TEMP, query and update the average temperature field in the ASIC_TEMPERATURE_INFO table in the stateDB.
+* If the ASIC SAI supports SAI_SWITCH_ATTR_MAX_TEMP, query and update the maximum_temperature field in the ASIC_TEMPERATURE_INFO table in the stateDB.
+
+### 6.3 Platform changes to support ASIC Thermals
+
+Platform owners typically provide the implementation for Thermals (https://github.com/sonic-net/sonic-platform-common/blob/master/sonic_platform_base/thermal_base.py). While there is no change in existing Platform API definitions, apart from external/CPU sensors, platform vendors should also include ASIC internal sensors in the _thermal_list[] of the Chassis / Module implementations.
+
+Assuming a Multi ASIC Chassis with 3 ASICs, the thermal names could be:
+ASIC0 Internal 0, ... ASIC0 Internal N0, ASIC1 Internal 0, ... ASIC1 Internal N1, ASIC2 Internal 0, ... ASIC2 Internal N2
+where ASIC0, ASIC1 and ASIC2 have N0, N1 and N2 internal sensors respectively.
+
+The implementation of the APIs get_high_threshold(), get_low_threshold(), get_high_critical_threshold(), get_name(), get_presence() etc.. are platform (ASIC) specific. The get_temperature() should retrieve the temperature from the ASIC_TEMPERATURE_INFO table of the stateDB from the concerned ASIC's DB instance (which is populated by the SwitchOrch poller as described [above](#62-switchorch-changes)). 
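As a non-normative illustration of the get_temperature() flow described above: the class name `AsicInternalThermal` is invented here, and the ASIC's state DB instance is mocked with a plain dict rather than a real DB connector.

```python
# Illustrative sketch: a Thermal object backed by the ASIC_TEMPERATURE_INFO
# state DB table. The state DB is mocked as a dict; a real implementation
# would read from the concerned ASIC's stateDB instance.

class AsicInternalThermal:
    def __init__(self, asic_name, sensor_index, state_db):
        self._name = "{} Internal {}".format(asic_name, sensor_index)
        self._field = "temperature_{}".format(sensor_index)
        self._state_db = state_db  # stand-in for the ASIC's DB connector

    def get_name(self):
        return self._name

    def get_temperature(self):
        # Fetch the value populated by the SwitchOrch poller; None if the
        # poller has not yet run (or is disabled) for this ASIC.
        table = self._state_db.get("ASIC_TEMPERATURE_INFO", {})
        value = table.get(self._field)
        return float(value) if value is not None else None

# Example: state DB contents as the poller would have written them
state_db = {"ASIC_TEMPERATURE_INFO": {"average_temperature": "48.5",
                                      "maximum_temperature": "52.0",
                                      "temperature_0": "47.0",
                                      "temperature_1": "52.0"}}
t = AsicInternalThermal("ASIC0", 1, state_db)
print(t.get_name(), t.get_temperature())  # ASIC0 Internal 1 52.0
```

A real implementation would obtain the value through the ASIC's DB connector instead of a dict, but the lookup pattern is the same.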
+
+The thermalctld's TemperatureUpdater::_refresh_temperature_status() retrieves the temperatures of the ASIC internal sensors from the get_temperature() API - just as it would for any external sensor. Only that in the case of ASIC internal sensors, the get_temperature() API is going to retrieve and return the value from the ASIC_TEMPERATURE_INFO table. The thermalctld also updates these values to the TEMPERATURE_INFO table in the globalDB's stateDB. Thus, there is no change in the existing thermalctld infrastructure.
+
+## 7 Virtual Switch
+
+NA
+
+## 8 Restrictions
+
+1. Unlike external sensors, ASIC's internal sensors are retrievable only thru the SDK/SAI. The proposed design eliminates the need for pmon from having to make SAI calls. Considering that thermalctld's default UPDATE_INTERVAL is 60 seconds, the ASIC_SENSORS_POLLER_INTERVAL should ideally be set to an appropriate lower value for better convergence.
+2. A CLI is not currently defined for the poller configuration (for setting/getting the Poller admin state and interval configuration). 
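Restriction 1 above implies a simple selection rule for the poller interval. A hedged sketch: the 5-300 s range comes from the functional requirements, the 60 s figure is thermalctld's default UPDATE_INTERVAL, and both helper names are invented for illustration.

```python
# Illustrative sketch of choosing an ASIC_SENSORS_POLLER_INTERVAL value.
# The 5..300 s range is the configurable range from the requirements; for
# good convergence the poller should refresh the state DB faster than
# thermalctld's default 60 s UPDATE_INTERVAL reads it.

THERMALCTLD_UPDATE_INTERVAL = 60  # seconds (thermalctld default)

def clamp_poller_interval(requested):
    """Clamp a requested interval to the configurable 5..300 s range."""
    return max(5, min(300, requested))

def converges_before_thermalctld(interval):
    """True if the poller refreshes the state DB faster than thermalctld polls it."""
    return interval < THERMALCTLD_UPDATE_INTERVAL

print(clamp_poller_interval(2))    # 5
print(clamp_poller_interval(600))  # 300
print(converges_before_thermalctld(clamp_poller_interval(30)))  # True
```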
+ +## 9 Unit Test cases +Unit test case one-liners are given below: + +| # | Test case synopsis | Expected results | +|-------|----------------------|------------------| +|1| Set "ASIC_SENSORS\|ASIC_SENSORS_POLLER_STATUS" "admin_status" to "enable" for a specific ASIC instance | Check that ASIC internal sensors are dumped periodically in the ASIC_TEMPERATURE_INFO of the ASIC's stateDB instance and to the globalDB's TEMPERATURE_INFO table +|2| Set "ASIC_SENSORS\|ASIC_SENSORS_POLLER_STATUS" "admin_status" to "disable" for a specific ASIC instance | Check that the poller stops +|3| Set "ASIC_SENSORS\|ASIC_SENSORS_POLLER_INTERVAL" "interval" to "30" for a specific ASIC instance | Check that the poller interval changes from the default 10 seconds +|4| Issue "show platform temperature" | Check that the ASIC internal temperatures are displayed for all the ASICs + +## 10 Action items diff --git a/doc/auto_techsupport_and_coredump_mgmt.md b/doc/auto_techsupport_and_coredump_mgmt.md index 97704af41b0..d80938c2927 100644 --- a/doc/auto_techsupport_and_coredump_mgmt.md +++ b/doc/auto_techsupport_and_coredump_mgmt.md @@ -225,7 +225,7 @@ module sonic-auto_techsupport { The actual value in bytes is calculate based on the available space in the filesystem hosting /var/dump When the limit is crossed, the older core files are incrementally deleted */ - description "Max Limit in percentage for the cummulative size of ts dumps. No cleanup is performed if the value isn't configured or is 0.0"; + description "Max Limit in percentage for the cumulative size of ts dumps. No cleanup is performed if the value isn't configured or is 0.0"; type decimal-repr; } @@ -236,7 +236,7 @@ module sonic-auto_techsupport { The actual value in bytes is calculated based on the available space in the filesystem hosting /var/core When the limit is crossed, the older core files are deleted */ - description "Max Limit in percentage for the cummulative size of core dumps. 
No cleanup is performed if the value isn't congiured or is 0.0";
+            description "Max Limit in percentage for the cumulative size of core dumps. No cleanup is performed if the value isn't configured or is 0.0";
             type decimal-repr;
         }

@@ -452,13 +452,13 @@ root-overlay 32896880 5460768 25742008 18% /
```

/var/core & /var/dump directories are hosted on root-overlay filesystem and this usually ranges from 10G to 25G+.
-A default value of 5% would amount to a minimum of 500 MB which is a already a decent space for coredumps. For techsupport a default value of 10% would amount to a minium of 1G, which might accomodate from 5-10 techsupports.
+A default value of 5% would amount to a minimum of 500 MB which is already a decent space for coredumps. For techsupport a default value of 10% would amount to a minimum of 1G, which might accommodate from 5-10 techsupports.
Although if the admin feels otherwise, these values are configurable.

### 7.8 Techsupport Locking

-Recently, an enhancement was made to techsupport script to only run one instance at a time by using a locking mechanism. When other script instance of techsupport tries to run, it'll exit with a relevent code. This would apply nevertheless of how a techsupport was invoked i.e. manual or through auto-techsupport.
+Recently, an enhancement was made to techsupport script to only run one instance at a time by using a locking mechanism. When another instance of techsupport tries to run, it'll exit with a relevant code. This would apply regardless of how a techsupport was invoked i.e. manual or through auto-techsupport.

With this change, rate-limit-interval of zero would not make any difference. The locking mechanism would implicitly impose a minimum rate-limit-interval of techsupport execution time. 
And since, the techsupport execution time can't be found out and varies based on underlying machine and system state, the range of values configurable for the rate-limit-interval is left unchanged

@@ -470,7 +470,7 @@ Enhance the existing techsupport sonic-mgmt test with the following cases.

| S.No | Test case synopsis |
|------|-----------------------------------------------------------------------------------------------------------------------------------------|
-| 1 | Check if the `coredump_gen_handler` script is infact invoking the techsupport cmd, when configured |
+| 1 | Check if the `coredump_gen_handler` script is in fact invoking the techsupport cmd, when configured |
| 2 | Check if the techsupport cleanup is working as expected |
| 3 | Check if the global rate-& & per-process rate-limit-interval is working as expected |
| 4 | Check if the core-dump cleanup is working as expected |
@@ -482,7 +482,7 @@ The configuration in the init_cfg.json is loaded to the running config i.e. CONF

### 10 App Extension Considerations

-Detailed Info related to Appliation Extension can be found here: https://github.com/sonic-net/SONiC/blob/master/doc/sonic-application-extension/sonic-application-extention-hld.md
+Detailed Info related to Application Extension can be found here: https://github.com/sonic-net/SONiC/blob/master/doc/sonic-application-extension/sonic-application-extention-hld.md

A new AUTO_TECHSUPPORT_FEATURE register/deregister option will be introduced. The existing FeatureRegistry class will be enhanced to also add/delete configuration related to AUTO_TECHSUPPORT_FEATURE table.

diff --git a/doc/banner/banner_hld.md b/doc/banner/banner_hld.md
index 82163d948ef..b81ee2f714c 100644
--- a/doc/banner/banner_hld.md
+++ b/doc/banner/banner_hld.md
@@ -109,7 +109,7 @@ This feature require access to SONiC DB. All messages (MOTD, login and logout) s

###### Figure 4: Banner show configuration

The default Banner feature state is disabled. 
It means - the current (default) SONiC OS banner messages won't be changed.
-With disabled feature state - user can use provided CLI to configre banner messages. The changes will be applied to Config DB table.
+With disabled feature state - user can use provided CLI to configure banner messages. The changes will be applied to Config DB table.
Only with enabled feature state, configured banner messages from Config DB will be applied to Linux.

## 2.3 CLI
@@ -160,7 +160,7 @@ config banner login
config banner logout
```

-**The following command set mesage of the day (MOTD):**
+**The following command sets message of the day (MOTD):**
```bash
config banner motd
```
@@ -246,7 +246,7 @@ New YANG model `sonic-banner.yang` will be added to provide support for configur

        leaf logout {
            type string;
-            description "Banner message dispalyed to the users on logout";
+            description "Banner message displayed to the users on logout";
            default "";
        }
    } /* end of container MESSAGE */
diff --git a/doc/barefoot_dtel/Dtel-SONiC.md b/doc/barefoot_dtel/Dtel-SONiC.md
index faa16ecb950..90135c9f450 100644
--- a/doc/barefoot_dtel/Dtel-SONiC.md
+++ b/doc/barefoot_dtel/Dtel-SONiC.md
@@ -321,7 +321,7 @@ __Figure 4: Control flow for DTel events corresponding to ref-counted objects__.
  * Report session OID
  * Reference count

-**Port hashmap (ports with atleast one queue on which reporting in enabled)**
+**Port hashmap (ports with at least one queue on which reporting is enabled)**

* Key: ifName
* Value:
diff --git a/doc/bfd/BFD_Enhancement_HLD.md b/doc/bfd/BFD_Enhancement_HLD.md
index 8c580dd0197..05f1cc8fe3a 100644
--- a/doc/bfd/BFD_Enhancement_HLD.md
+++ b/doc/bfd/BFD_Enhancement_HLD.md
@@ -87,19 +87,19 @@ When BFD timer configuration is changed the packet stored in memory will be flus
When a Rx packet is received with poll bit set, BFD will flush the stored Tx packet and a fresh packet will be sent until the negotiation is complete. 
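The Tx-packet caching behaviour described above (flush the stored packet on timer reconfiguration, flush on a received poll bit, rebuild on the next transmit) can be sketched as follows. This is an illustrative model with invented names, not the actual FRR/SONiC implementation.

```python
# Illustrative sketch of the BFD Tx-packet cache rule: the pre-built
# control packet is reused until a timer change or a received poll bit
# invalidates it, at which point a fresh packet is built.

class BfdTxCache:
    def __init__(self, tx_interval_ms):
        self.tx_interval_ms = tx_interval_ms
        self._cached = None

    def _build_packet(self):
        # Stand-in for real BFD control-packet construction.
        return {"tx_interval_ms": self.tx_interval_ms}

    def on_timer_change(self, new_interval_ms):
        # Timer reconfiguration flushes the stored packet.
        self.tx_interval_ms = new_interval_ms
        self._cached = None

    def on_rx_poll_bit(self):
        # A poll sequence renegotiates parameters: drop the cached packet
        # so fresh packets are sent until negotiation completes.
        self._cached = None

    def next_tx_packet(self):
        if self._cached is None:
            self._cached = self._build_packet()
        return self._cached

cache = BfdTxCache(300)
p1 = cache.next_tx_packet()
assert cache.next_tx_packet() is p1      # reused while nothing changed
cache.on_timer_change(150)
assert cache.next_tx_packet() is not p1  # rebuilt after timer change
```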
### 3.1.3 LAG support:
-When deploying BFD over LAG interface it is expected that BFD session do not flap when a LAG member port flaps. BFD packets are send over a member port in the LAG based on the hashing in the kernel. When the port on which BFD packets are being sent goes down, BFD packets should seamlessly switchover to next available member port in LAG decided by hasing in the kernel.
+When deploying BFD over LAG interface it is expected that BFD session do not flap when a LAG member port flaps. BFD packets are sent over a member port in the LAG based on the hashing in the kernel. When the port on which BFD packets are being sent goes down, BFD packets should seamlessly switchover to next available member port in LAG decided by hashing in the kernel.

One BFD session will be created per LAG irrespective of number of member port in the LAG. RFC 7130 does specify creation of BFD session for each member port of LAG, but this will not be implemented.

Supporting LAG is challenging in BFD due to time it may take for member port down event to reach control plane, in SONiC when a port is DOWN, the down event has to traverse up to ORCH agent and then back to the kernel, this may take considerable time.

-In current SONiC implementation BFD relies on kernel network stack to switch the BFD packet to next available active port when a member port goes DOWN in LAG. BFD timers in this case is directly proportional to the time it takes for kernel to get the port down event. Faster the kernel learns port down the more aggressive BFD timers can be. In this case it is suggested to configure BFD timer values to have a timeout value of atleast 600 msec.
+In current SONiC implementation BFD relies on kernel network stack to switch the BFD packet to next available active port when a member port goes DOWN in LAG. BFD timers in this case is directly proportional to the time it takes for kernel to get the port down event. 
Faster the kernel learns port down the more aggressive BFD timers can be. In this case it is suggested to configure BFD timer values to have a timeout value of at least 600 msec.

### 3.1.4 ECMP Support:
For BFD multihop session there could be multiple nexthop to reach the destination, it is expected that BFD session do not flap when an active nexthop goes down, BFD session should seamlessly switchover to next available nexthop without bringing down the BFD session.

Supporting ECMP is challenging in BFD due to time it may take for control plane to know that the active nexthop went down. In SONiC this information has to traverse all the way down to kernel after traversing all the DBs this may take considerable time.

-In current SONiC implementation BFD relies on kernel network stack to switch the BFD packet to next available nexthop when a active nexthop goes down. BFD timers in this case is directly proportional to the time it takes for kernel to get the nexthop down event. Faster the kernel learns active nexthop is down the more aggressive BFD timers can be. In this case it is suggested to configure BFD timer values to have a timeout of atleast 600 msec.
+In current SONiC implementation BFD relies on kernel network stack to switch the BFD packet to next available nexthop when an active nexthop goes down. BFD timers in this case is directly proportional to the time it takes for kernel to get the nexthop down event. Faster the kernel learns active nexthop is down the more aggressive BFD timers can be. In this case it is suggested to configure BFD timer values to have a timeout of at least 600 msec.

## 3.2 CLI
### 3.2.1 Data Models
@@ -355,7 +355,7 @@ Unit test cases for this specification are as listed below:

47| |Verify BFD session timeout on all ECMP path DOWN.
48| |Verify BFD session timeout when an intermediate path is DOWN. 
||BFD CLI|
-49| |Verify CLI to cofigure BFD for BGP
+49| |Verify CLI to configure BFD for BGP
50| |Verify CLI to configure transmit interval
51| |Verify CLI to configure receive interval
52| |Verify CLI to configure detection multiplier
@@ -404,7 +404,7 @@ Unit test cases for this specification are as listed below:

95| |Verify CLI to display IPv4 multihop Peer.
96| |Verify CLI to display IPv6 multihop Peer.
97| |Verify config save and reload of BFD configuration.
-98| |Verify unsaved config loss after relaod.
+98| |Verify unsaved config loss after reload.
||BFD static peer|
99| |Verify BFD static IPv4 single hop peer establishment.
100| |Verify BFD static IPv4 multi hop peer establishment.
diff --git a/doc/bgp_error_handling/BGP_Route_Error_Handling_Arlo.md b/doc/bgp_error_handling/BGP_Route_Error_Handling_Arlo.md
index 99c53610930..d2b6af41a97 100644
--- a/doc/bgp_error_handling/BGP_Route_Error_Handling_Arlo.md
+++ b/doc/bgp_error_handling/BGP_Route_Error_Handling_Arlo.md
@@ -84,7 +84,7 @@ Zebra, on receiving the message containing route install success, will notify BG

Zebra, on receiving the message containing failed route notification, will withdraw the route from kernel. It will also mark the route with flag as "Not installed in hardware" and store the route. It will not send the next best route to fpmsyncd. At this stage, route is present in Zebra. It will NOT notify BGP of the route add failure.

### 3.3.2 BGP changes
-When BGP learns a route, it marks the route as "pending FIB install" and sends the route to Zebra. The route may or may not be successfully installed in hardware. On receiving route add sucess notification message, BGP will remove the "pending FIB install" flag and advertise the route to its peers.
+When BGP learns a route, it marks the route as "pending FIB install" and sends the route to Zebra. The route may or may not be successfully installed in hardware. 
On receiving route add success notification message, BGP will remove the "pending FIB install" flag and advertise the route to its peers. In case user wants to retry the installation of failed routes, he/she can issue the command in Zebra. The command will reprogram the failed route in kernel and send that route to hardware. If the route is successfully programmed in hardware, it will notify Zebra. Zebra will, in turn, notify BGP and route will be advertised to its neighbors. @@ -124,7 +124,7 @@ Commands: disable Administratively Disable BGP error-handling enable Administratively Enable BGP error handling ``` - When the error-handling is disabled, fpmsyncd will not subcribe to any notification from ERROR_ROUTE_TABLE. By default, the error-handling feature is disabled. During system reload, config replay for this feature is possible when the docker routing config mode is unified or split. + When the error-handling is disabled, fpmsyncd will not subscribe to any notification from ERROR_ROUTE_TABLE. By default, the error-handling feature is disabled. During system reload, config replay for this feature is possible when the docker routing config mode is unified or split. This feature can be turned off on demand. But it can affect the system stability. When the config was turned on, there may be some routes in BGP, for which, it is waiting for update from hardware. When the feature is turned off, we will unsubscribe from ERROR_DB and will no longer receive any notifications from hardware. Hence, some of the routes may not receive any notification from hardware. It is recommended to restart the BGP docker when the config state is changed to disable from enable. By default, this config is disabled. If the config is changed from disable to enable, we do not need to restart the docker. But the feature will be affecting only those routes which will be learnt after enabling the feature. 
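The per-route flags described in sections 3.3.1 and 3.3.2 amount to a small state machine: BGP holds a route as "pending FIB install" until hardware programming succeeds, and on failure the route stays unadvertised. A hedged sketch follows; the state names and class are invented for illustration, not taken from the Zebra/BGP code.

```python
# Illustrative model of the route error-handling flags: success clears
# the pending flag and lets BGP advertise; failure leaves the route
# stored but marked as not installed in hardware and never advertised.

PENDING_FIB_INSTALL = "pending-fib-install"
INSTALLED = "installed"
NOT_IN_HARDWARE = "not-installed-in-hardware"

class RouteState:
    def __init__(self, prefix):
        self.prefix = prefix
        self.state = PENDING_FIB_INSTALL  # set when BGP first learns the route
        self.advertised = False

    def on_install_success(self):
        # Zebra relayed a hardware-install success: advertise to peers.
        self.state = INSTALLED
        self.advertised = True

    def on_install_failure(self):
        # Zebra withdraws the route from the kernel and stores it; BGP is
        # not notified, so the route remains unadvertised.
        self.state = NOT_IN_HARDWARE

r = RouteState("10.0.0.0/24")
r.on_install_success()
print(r.state, r.advertised)      # installed True

bad = RouteState("10.1.0.0/24")
bad.on_install_failure()
print(bad.state, bad.advertised)  # not-installed-in-hardware False
```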
diff --git a/doc/bgp_loading_optimization/bgp-loading-optimization-hld.md b/doc/bgp_loading_optimization/bgp-loading-optimization-hld.md
index 6fdd74c1d0e..0649f09efdd 100644
--- a/doc/bgp_loading_optimization/bgp-loading-optimization-hld.md
+++ b/doc/bgp_loading_optimization/bgp-loading-optimization-hld.md
@@ -172,7 +172,7 @@ we increase pipeline size from the default 125 to 50k, which would decrease the

#### Add a timer to help delay the flush

-Current implementation let fpmsyncd flush the pipeline on every monitered event, then the downstream orchagent is always fed with data in a small batch. But from a performance perspective, orchagent would prefer those small data batch to come as a whole.
+The current implementation lets fpmsyncd flush the pipeline on every monitored event, then the downstream orchagent is always fed with data in a small batch. But from a performance perspective, orchagent would prefer those small data batches to come as a whole.

So we choose to skip some of the event-triggered flush.
diff --git a/doc/bmc/leakage_detection_hld.md b/doc/bmc/leakage_detection_hld.md
index a4eb49026c6..799cedec63c 100644
--- a/doc/bmc/leakage_detection_hld.md
+++ b/doc/bmc/leakage_detection_hld.md
@@ -17,7 +17,7 @@ The leak alarm process is straightforward. 
The platform API first acquires the s
A new object `LiquidCollingBase` will be added to the `sonic-platform-common` to reflect the new liquid cooling device
```
-Class LiquidcollingBase(ojbect):
+Class LiquidcollingBase(object):

    leakge_sensors_num = 0
    leakage_sensors = {}
@@ -53,7 +53,7 @@ Class LiquidcollingBase(ojbect):
        return leaking_sensors

Class LeakageSensor(sensor_base):
-    "" there might be mutiple leakge detection sensors, to let user better find the location,
+    "" there might be multiple leakage detection sensors, to let user better find the location,
    name = ""
    leaking = 0
@@ -72,7 +72,7 @@ During initialization, a separate thread will be launched to periodically call t
New configuration will be added to pmon_daemon_control.json, to indicate whether the system has liquid cooling system, if not, the object and thread will not be created in the initialization of thermalctld at all to avoid performance overhead.
```
-# to enable the seperate thread for liquid cooling monitor
+# to enable the separate thread for liquid cooling monitor
enable_liquid_cooling: true,
# set the interval to update the leakage status, default 0.5
liquid_cooling_update_interval: 0.5
@@ -94,7 +94,7 @@ class LiquidCoolingUpdater():
```
### stat_db data schema
-the `LIQUID_COOLING_DEVICE` table stores all the date gathered by thermal control deamon, currtenly, it will have only `leakage_sensors`
+the `LIQUID_COOLING_DEVICE` table stores all the data gathered by thermal control daemon, currently, it will have only `leakage_sensors`
```
Defines a logical structure for liquid cooling devices, with keys for various sensors.
@@ -145,11 +145,11 @@ leak_sensors3   Not OK     LiquidCooling
```
## 8. Performance
-Seperate thread will be lunched in thermal contorl daemon keep monitoring entire liquid cooling device status within 0.5s interval
+Separate thread will be launched in thermal control daemon to keep monitoring entire liquid cooling device status within 0.5s interval
## 9. 
Testing
A mock testing should be created to demonstrate the functionality of this implementation. Once simulated a leaking event, these things need to be checked:
-1. correct sensors number had been indicated in the syslog messge
+1. the correct sensor count is indicated in the syslog message
2. state db is rightly updated
3. GNMI event had been sent out
4. `show platform leakage status` command output is correct
diff --git a/doc/bmp/bmp.md b/doc/bmp/bmp.md
index 053ef62510b..45d119b1d73 100644
--- a/doc/bmp/bmp.md
+++ b/doc/bmp/bmp.md
@@ -216,7 +216,7 @@ admin@bjw-can-3800-1:~$ redis-cli -n 20 -p 6400 HGETALL "BGP_RIB_OUT_TABLE|192.1
## 2.5 BMP Agent
As [2.2 OpenBMP](#22-openbmp) shown, We need to fork and update code in [OpenBMPd](#https://github.com/SNAS/openbmp/tree/master/Server/). OpenBMP supports BMP protocol collecting by openbmpd agent. Thus in this project we will only need openbmpd agent role, and add redis population when monitoring BGP data from BGP container.
-Below picture is referenced from [OpenBMPFlow](#https://www.openbmp.org/#openbmp-flow/), refer the part in red circle, which is the daemon we need to update in this porject.
+Below picture is referenced from [OpenBMPFlow](#https://www.openbmp.org/#openbmp-flow/), refer to the part in the red circle, which is the daemon we need to update in this project.

OPENBMP ARCHITECTURE

@@ -268,7 +268,7 @@ Create below config items list for enabling and disabling different table. 
### Full Dataset supported

-[OpenBMP dataset](https://github.com/SNAS/openbmp/blob/master/docs/MESSAGE_BUS_API.md#message-api-parsed-data), we can find full dataset info as reference here which is Kafaka based TSV message, however, we will not follow it's format when populats the redis database, the data format we use is decalred [2.4 Database Design](#24-database-design)
+[OpenBMP dataset](https://github.com/SNAS/openbmp/blob/master/docs/MESSAGE_BUS_API.md#message-api-parsed-data), we can find full dataset info as reference here, which is a Kafka-based TSV message; however, we will not follow its format when populating the redis database, the data format we use is declared in [2.4 Database Design](#24-database-design)

## 2.6 GNMI support

diff --git a/doc/buffer-watermark/align_watermark_flow_with_port_configuration_HLD.md b/doc/buffer-watermark/align_watermark_flow_with_port_configuration_HLD.md
index 930213103b0..3e9860e8e21 100644
--- a/doc/buffer-watermark/align_watermark_flow_with_port_configuration_HLD.md
+++ b/doc/buffer-watermark/align_watermark_flow_with_port_configuration_HLD.md
@@ -1,316 +1,316 @@
-
-# Align watermark flow with port configuration HLD
-
-# High Level Design Document
-## Rev 0.1
-
-# Table of Contents
-- [1. Revision](#1-revision)
-- [2. Scope](#2-scope)
-- [3. Motivation](#3-motivation)
-- [4. Abbreviations](#4-abbreviations)
-- [5. Introduction](#5-introduction)
-- [6. HLD design](#6-hld-design)
- - [6.1 The Requirement](#61-the-requirement)
 - - [6.2 Queue and PG maps](#62-queue-and-pg-maps)
 - - [6.3 PG and QUEUE map generation correct flows in Flexcounter](#63-pg-and-queue-map-generation-correct-flows-in-flexcounter)
 - - [6.4 Change of PG map and QUEUE map generation in flexcounter](#64-change-of-pg-map-and-queue-map-generation-in-flexcounter)
 - - [6.5 Effect upon disablement of counterpoll queue/watermark/pg-drop](#65-effect-upon-disablement-of-counterpoll-queue/watermark/pg-drop)
-- [7. 
Suggested changes](#7-suggested-changes) - - [7.1 Current flow](#71-current-flow) - - [7.2 Suggested flow](#72-suggested-flow) - - [7.3 Addition of port when watermark is enabled](#73-addition-of-port-when-watermark-is-enabled) -- [8. Testing](#8-testing) - - [8.1 Manual testing](#81-manual-testing) - - [8.1.1 Regression testing](#811-test-flex-counter-logic) - - [8.1.2 Regression testing](#812-test-flex-counter-logic-with-reboot) - - [8.1.3 Regression testing](#813-adding-a-port-when-watermark-is-enabled) - - [8.2 VS tests](#82-vs-tests) - -# 1. Revision -| Rev | Date | Author | Change description | -|:----------:|:----------:|:--------------:|:----------------------:| -| 0.1 | 17/08/2022 | Doron Barashi | Initial version| -| | | | | - -# 2. Scope -This document provides high level design for alignment for queue, watermark and pg flows in flexcounter. - -# 3. Motivation -Fix wrong flows existing in flexcounter for queue, watermark and pg counter polling. - -# 4. Abbreviations - -| Term | Meaning | -|:--------:|:---------------------------------------------:| -| PG | Priority Group| -| PG-DROP | Priority Group drop (packets)| - -# 5. Introduction - -Currently, the queue map will not be generated if: -- the watermark counter polling is enabled -- the queue counter poll is disabled. 
- -This is because there is a missing logic in flexcounter regarding the queue and watermark handling: -- the queue map is generated only if queue counter polling is enabled -- the PG map is generated only if PG watermark polling is enabled - -This is not complete - -Assume the counter polling is disabled for all counters in CONFIG_DB when the system starts, -- Enable watermark counter polling only - - PG watermark works correctly - - *Queue watermark does not work because queue map isn’t generated* -- Enable pg-drop counter polling only - - *PG drop counter does not work because PG maps isn’t generated* - - also pg-watermark stats are not added since PG map generation isn't called -- Enable queue counter polling only - - Queue counter works correctly, *but also queue-watermark stats are added* - -The branches marked in italic are not expected and caused by the missing logic required in this FR. - -# 6. HLD design - -## 6.1 The Requirement - -- Flexcounter to generate the queue maps when queue or watermark counter is enabled for the 1st time, which means: - - queue and watermark polling were disabled - - queue or watermark polling is about to be set to true -- Flexcounter to generate PG maps when pg-drop or watermark counter is enabled for the 1st time, which means: - - PG and watermark polling were disabled - - PG or watermark polling is about to be set to true - -## 6.2 Queue and PG maps - -These are the Queue and PG maps in COUNTER_DB that should be created upon relevant counterpoll enablement - -Queue maps in COUNTERS_DB upon counterpoll queue or watermark enable (currently created only upon queue enable): -``` -COUNTERS_QUEUE_NAME_MAP -COUNTERS_QUEUE_INDEX_MAP -COUNTERS_QUEUE_PORT_MAP -COUNTERS_QUEUE_TYPE_MAP -``` - -PG maps in COUNTERS_DB upon counterpoll pg-drop or watermark enable (currently created only upon pg-watermark enable): -``` -COUNTERS_PG_NAME_MAP -COUNTERS_PG_PORT_MAP -COUNTERS_PG_INDEX_MAP -``` - -## 6.3 PG and QUEUE map generation correct flows 
in Flexcounter - -queue enable only: - -- QUEUE_STAT_COUNTER should be generated in FLEX_COUNTER_DB - -- QUEUE map should be generated in COUNTER_DB - -- no WATERMARK stats counters should be generated in flex COUNTER_DB - -***Example*** - -``` -1276) "FLEX_COUNTER_TABLE:QUEUE_STAT_COUNTER:oid:0x15000000000284" -``` - - -pg-drop enable only: - -- PG_DROP_STAT_COUNTER should be generated in FLEX_COUNTER_DB - -- PG map should be generated in COUNTER_DB - -- no *WATERMARK* stats counters should be generated in flex COUNTER_DB - -***Example*** - -``` -642) "FLEX_COUNTER_TABLE:PG_DROP_STAT_COUNTER:oid:0x1a00000000008f" -``` - - -watermark enable only: - -- PG_WATERMARK_STAT_COUNTER should be generated in FLEX_COUNTER_DB (db 5) -- QUEUE_WATERMARK_STAT_COUNTER should be generated in FLEX_COUNTER_DB (db 5) -- BUFFER_POOL_WATERMARK_STAT_COUNTER should be generated in FLEX_COUNTER_DB (db 5) - -- PG_WATERMARK should be generated in COUNTER_DB (db 2) -- QUEUE_WATERMARK should be generated in COUNTER_DB (db 2) -- BUFFER_POOL_WATERMARK should be generated in COUNTER_DB (db 2) - -***Example*** - -``` -641) "FLEX_COUNTER_TABLE:PG_WATERMARK_STAT_COUNTER:oid:0x1a0000000000d9" -642) "FLEX_COUNTER_TABLE:PG_DROP_STAT_COUNTER:oid:0x1a00000000008f" -``` - -``` -1276) "FLEX_COUNTER_TABLE:QUEUE_STAT_COUNTER:oid:0x15000000000284" -1277) "FLEX_COUNTER_TABLE:QUEUE_WATERMARK_STAT_COUNTER:oid:0x150000000002bc" -``` - -## 6.4 Change of PG map and QUEUE map generation in flexcounter - -QUEUE map generation will be separated into two flows: - -- QUEUE map geneartion -- adding QUEUE or QUEUE_WATERMARK stats to FLEX_COUNTER_DB depending on the counterpoll enabled (queue or watermark) - -PG map generation will be separated into two flows: - -- PG map geneartion -- adding PG_DROP or PG_WATERMARK stats to FLEX_COUNTER_DB depending on the counterpoll enabled (pg-drop or watermark) - -## 6.5 Effect upon disablement of counterpoll queue/watermark/pg-drop - -Current implementation doesn't remove any stats 
entries from the flexcounter tables in the database upon counterpoll disable. -No changes will be done for these flows as it's done this way by design. - - -# 7. Suggested changes - -## 7.1 Current flow - -When queue is enabled in counterpoll, a generateQueueMap function is called in flexcounter.cpp. -this function calls generateQueueMapPerPort per physical port. -inside this per port function both the queue map is created and queue-watermark stats are added. - -When pg-watermark is enabled in counterpoll, a generatePriorityGroupMap function is called in flexcounter.cpp. -this function will call generatePriorityGroupMapPerPort per physical port. -inside this per port function both the pg map is created and pg-watermark stats are added. - -***Example*** - -``` - else if(key == QUEUE_KEY) - { - gPortsOrch->generateQueueMap(); - } - else if(key == PG_WATERMARK_KEY) - { - gPortsOrch->generatePriorityGroupMap(); - } -``` - -## 7.2 Suggested flow - -queue and queue-watermark will be separated: - -When queue or watermark is enabled in counterpoll, a generateQueueMap function is called in flexcounter.cpp. -this function will call generateQueueMapPerPort per physical port. -inside this per port function only the queue map will be created. -queue stats creation will be separated into a different inner function that will be called separately if queue counterpoll is enabled. - -When only watermark is enabled in counterpoll, this block will call both generateQueueMap and addQueueWatermarkFlexCounters -which will call addQueueWatermarkFlexCountersPerPort per physical port. -inside these per port functions both the queue map will be created and queue-watermark stats will be added respectively. - -pg and pg-watermark will be separated: - -When pg-drop or watermark is enabled in counterpoll, a generatePriorityGroupMap function is called in flexcounter.cpp. -this function will call generatePriorityGroupMapPerPort per physical port. 
-inside this per port function only PG map will be created. -pg-drop stats creation will be separated into a different inner function that will be called separately if pg-drop counterpoll is enabled. - -When only watermark is enabled in counterpoll, this block will call both generatePriorityGroupMap abd addPriorityGroupWatermarkFlexCounters function -which will call addPriorityGroupWatermarkFlexCountersPerPort per physical port. -inside this per port functions both the PG map will be created and pg-watermark stats will be added respectively. - - -***Example*** - -``` - else if(key == QUEUE_KEY) - { - gPortsOrch->generateQueueMap(); - gPortsOrch->addQueueFlexCounters(); - } - else if(key == QUEUE_WATERMARK) - { - gPortsOrch->generateQueueMap(); - gPortsOrch->addQueueWatermarkFlexCounters(); - } - else if(key == PG_DROP_KEY) - { - gPortsOrch->generatePriorityGroupMap(); - gPortsOrch->addPriorityGroupDropFlexCounters(); - } - else if(key == PG_WATERMARK_KEY) - { - gPortsOrch->generatePriorityGroupMap(); - gPortsOrch->addPriorityGroupWatermarkFlexCounters(); - } -``` - -if queue or PG maps already created upon queue or pg-drop enablement and then watermark is enabled, -the queue or PG maps won't be created again. -this is done using the private boolean storing the created status. if it's already true, function returns. -the same mechanism will be done in the new watermark functions. - -***PG map Example*** - -``` - if (m_isPriorityGroupMapGenerated) - { - return; - } -``` - -## 7.3 Addition of port when watermark is enabled - -when watermark counterpoll is enabled and a certaion port will be added, the watermark stats for this port will be added. - -# 8. Testing - -## 8.1 Manual testing - -### 8.1.1 test flex counter logic -Test_flex_counter_logic(type=[watermark,queue,pg-drop]: -1. Disable the counter polling for all counters and save, reload configuration -2. 
Switch (type) - - watermark: enable the watermark counter polling, check whether PG and queue map are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB - - queue: enable the queue counter polling, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB - - pg-drop: enable the pg-drop counter polling, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB - -repeat this test 3 times, each time with different type [watermark,queue,pg-drop] - -### 8.1.2 test flex counter logic with reboot -Test_flex_counter_logic(type=[watermark,queue,pg-drop] reboot after counterpoll enable: -1. Disable the counter polling for all counters and save, reload configuration -2. Switch (type) - - watermark: enable the watermark counter polling, check whether PG and queue map are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB - - queue: enable the queue counter polling, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB - - pg-drop: enable the pg-drop counter polling, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB -3. reboot switch - Switch (type) - - watermark: watermark counter polling is enabled, check that PG and queue maps are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB - - queue: queue counter polling is enabled, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB - - pg-drop: pg-drop counter polling is enabled, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB - -### 8.1.3 Adding a port when watermark is enabled - -1. disable a port -2. enable watermark counterpoll -3. verify this port stats does not exist in FLEX_COUNTERs_DB and COUNTERS_DB -4. enable the port previously disabled -5. 
verify this port stats is added to FLEX_COUNTERs_DB and COUNTERS_DB - -## 8.2 VS tests - -- Add queue-watermark and pg-drop to swss/test_flex_counters.py test - before this change only queue and pg-watermark are tested according to the old code implementation - -- Test_flex_counter_logic(type=[watermark,queue,pg-drop]: - -repeat this test 3 times, each time with different type [watermark,queue,pg-drop] -Test_flex_counter_logic(type=[watermark,queue,pg-drop] reboot after counterpoll enable: -1. Disable the counter polling for all counters and save, reload configuration -2. Switch (type) - - watermark: enable the watermark counter polling, check whether PG and queue map are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB - - queue: enable the queue counter polling, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB - - pg-drop: enable the pg-drop counter polling, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + +# Align watermark flow with port configuration HLD + +# High Level Design Document +## Rev 0.1 + +# Table of Contents +- [1. Revision](#1-revision) +- [2. Scope](#2-scope) +- [3. Motivation](#3-motivation) +- [4. Abbreviations](#4-abbreviations) +- [5. Introduction](#5-introduction) +- [6. HLD design](#6-hld-design) + - [6.1 The Requirement](#61-the-requirement) + - [6.2 Queue and PG maps](#62-queue-and-pg-maps) + - [6.3 PG and QUEUE map generation correct flows in Flexcounter](#63-pg-and-queue-map-generation-correct-flows-in-flexcounter) + - [6.4 Change of PG map and QUEUE map generation in flexcounter](#64-change-of-pg-map-and-queue-map-generation-in-flexcounter) + - [6.5 Effect upon disablement of counterpoll queue/watermark/pg-drop](#65-effect-upon-disablement-of-counterpoll-queue/watermark/pg-drop) +- [7. 
Suggested changes](#7-suggested-changes) + - [7.1 Current flow](#71-current-flow) + - [7.2 Suggested flow](#72-suggested-flow) + - [7.3 Addition of port when watermark is enabled](#73-addition-of-port-when-watermark-is-enabled) +- [8. Testing](#8-testing) + - [8.1 Manual testing](#81-manual-testing) + - [8.1.1 Regression testing](#811-test-flex-counter-logic) + - [8.1.2 Regression testing](#812-test-flex-counter-logic-with-reboot) + - [8.1.3 Regression testing](#813-adding-a-port-when-watermark-is-enabled) + - [8.2 VS tests](#82-vs-tests) + +# 1. Revision +| Rev | Date | Author | Change description | +|:----------:|:----------:|:--------------:|:----------------------:| +| 0.1 | 17/08/2022 | Doron Barashi | Initial version| +| | | | | + +# 2. Scope +This document provides high level design for alignment for queue, watermark and pg flows in flexcounter. + +# 3. Motivation +Fix wrong flows existing in flexcounter for queue, watermark and pg counter polling. + +# 4. Abbreviations + +| Term | Meaning | +|:--------:|:---------------------------------------------:| +| PG | Priority Group| +| PG-DROP | Priority Group drop (packets)| + +# 5. Introduction + +Currently, the queue map will not be generated if: +- the watermark counter polling is enabled +- the queue counter poll is disabled. 
+ +This is because there is a missing logic in flexcounter regarding the queue and watermark handling: +- the queue map is generated only if queue counter polling is enabled +- the PG map is generated only if PG watermark polling is enabled + +This is not complete + +Assume the counter polling is disabled for all counters in CONFIG_DB when the system starts, +- Enable watermark counter polling only + - PG watermark works correctly + - *Queue watermark does not work because queue map isn’t generated* +- Enable pg-drop counter polling only + - *PG drop counter does not work because PG maps isn’t generated* + - also pg-watermark stats are not added since PG map generation isn't called +- Enable queue counter polling only + - Queue counter works correctly, *but also queue-watermark stats are added* + +The branches marked in italic are not expected and caused by the missing logic required in this FR. + +# 6. HLD design + +## 6.1 The Requirement + +- Flexcounter to generate the queue maps when queue or watermark counter is enabled for the 1st time, which means: + - queue and watermark polling were disabled + - queue or watermark polling is about to be set to true +- Flexcounter to generate PG maps when pg-drop or watermark counter is enabled for the 1st time, which means: + - PG and watermark polling were disabled + - PG or watermark polling is about to be set to true + +## 6.2 Queue and PG maps + +These are the Queue and PG maps in COUNTER_DB that should be created upon relevant counterpoll enablement + +Queue maps in COUNTERS_DB upon counterpoll queue or watermark enable (currently created only upon queue enable): +``` +COUNTERS_QUEUE_NAME_MAP +COUNTERS_QUEUE_INDEX_MAP +COUNTERS_QUEUE_PORT_MAP +COUNTERS_QUEUE_TYPE_MAP +``` + +PG maps in COUNTERS_DB upon counterpoll pg-drop or watermark enable (currently created only upon pg-watermark enable): +``` +COUNTERS_PG_NAME_MAP +COUNTERS_PG_PORT_MAP +COUNTERS_PG_INDEX_MAP +``` + +## 6.3 PG and QUEUE map generation correct flows 
in Flexcounter + +queue enable only: + +- QUEUE_STAT_COUNTER should be generated in FLEX_COUNTER_DB + +- QUEUE map should be generated in COUNTER_DB + +- no WATERMARK stats counters should be generated in flex COUNTER_DB + +***Example*** + +``` +1276) "FLEX_COUNTER_TABLE:QUEUE_STAT_COUNTER:oid:0x15000000000284" +``` + + +pg-drop enable only: + +- PG_DROP_STAT_COUNTER should be generated in FLEX_COUNTER_DB + +- PG map should be generated in COUNTER_DB + +- no *WATERMARK* stats counters should be generated in flex COUNTER_DB + +***Example*** + +``` +642) "FLEX_COUNTER_TABLE:PG_DROP_STAT_COUNTER:oid:0x1a00000000008f" +``` + + +watermark enable only: + +- PG_WATERMARK_STAT_COUNTER should be generated in FLEX_COUNTER_DB (db 5) +- QUEUE_WATERMARK_STAT_COUNTER should be generated in FLEX_COUNTER_DB (db 5) +- BUFFER_POOL_WATERMARK_STAT_COUNTER should be generated in FLEX_COUNTER_DB (db 5) + +- PG_WATERMARK should be generated in COUNTER_DB (db 2) +- QUEUE_WATERMARK should be generated in COUNTER_DB (db 2) +- BUFFER_POOL_WATERMARK should be generated in COUNTER_DB (db 2) + +***Example*** + +``` +641) "FLEX_COUNTER_TABLE:PG_WATERMARK_STAT_COUNTER:oid:0x1a0000000000d9" +642) "FLEX_COUNTER_TABLE:PG_DROP_STAT_COUNTER:oid:0x1a00000000008f" +``` + +``` +1276) "FLEX_COUNTER_TABLE:QUEUE_STAT_COUNTER:oid:0x15000000000284" +1277) "FLEX_COUNTER_TABLE:QUEUE_WATERMARK_STAT_COUNTER:oid:0x150000000002bc" +``` + +## 6.4 Change of PG map and QUEUE map generation in flexcounter + +QUEUE map generation will be separated into two flows: + +- QUEUE map generation +- adding QUEUE or QUEUE_WATERMARK stats to FLEX_COUNTER_DB depending on the counterpoll enabled (queue or watermark) + +PG map generation will be separated into two flows: + +- PG map generation +- adding PG_DROP or PG_WATERMARK stats to FLEX_COUNTER_DB depending on the counterpoll enabled (pg-drop or watermark) + +## 6.5 Effect upon disablement of counterpoll queue/watermark/pg-drop + +Current implementation doesn't remove any stats 
entries from the flexcounter tables in the database upon counterpoll disable. +No changes will be done for these flows as it's done this way by design. + + +# 7. Suggested changes + +## 7.1 Current flow + +When queue is enabled in counterpoll, a generateQueueMap function is called in flexcounter.cpp. +this function calls generateQueueMapPerPort per physical port. +inside this per port function both the queue map is created and queue-watermark stats are added. + +When pg-watermark is enabled in counterpoll, a generatePriorityGroupMap function is called in flexcounter.cpp. +this function will call generatePriorityGroupMapPerPort per physical port. +inside this per port function both the pg map is created and pg-watermark stats are added. + +***Example*** + +``` + else if(key == QUEUE_KEY) + { + gPortsOrch->generateQueueMap(); + } + else if(key == PG_WATERMARK_KEY) + { + gPortsOrch->generatePriorityGroupMap(); + } +``` + +## 7.2 Suggested flow + +queue and queue-watermark will be separated: + +When queue or watermark is enabled in counterpoll, a generateQueueMap function is called in flexcounter.cpp. +this function will call generateQueueMapPerPort per physical port. +inside this per port function only the queue map will be created. +queue stats creation will be separated into a different inner function that will be called separately if queue counterpoll is enabled. + +When only watermark is enabled in counterpoll, this block will call both generateQueueMap and addQueueWatermarkFlexCounters +which will call addQueueWatermarkFlexCountersPerPort per physical port. +inside these per port functions both the queue map will be created and queue-watermark stats will be added respectively. + +pg and pg-watermark will be separated: + +When pg-drop or watermark is enabled in counterpoll, a generatePriorityGroupMap function is called in flexcounter.cpp. +this function will call generatePriorityGroupMapPerPort per physical port. 
+inside this per port function only PG map will be created.
+pg-drop stats creation will be separated into a different inner function that will be called separately if pg-drop counterpoll is enabled.
+
+When only watermark is enabled in counterpoll, this block will call both generatePriorityGroupMap and addPriorityGroupWatermarkFlexCounters function
+which will call addPriorityGroupWatermarkFlexCountersPerPort per physical port.
+inside these per port functions both the PG map will be created and pg-watermark stats will be added respectively.
+
+
+***Example***
+
+```
+ else if(key == QUEUE_KEY)
+ {
+ gPortsOrch->generateQueueMap();
+ gPortsOrch->addQueueFlexCounters();
+ }
+ else if(key == QUEUE_WATERMARK)
+ {
+ gPortsOrch->generateQueueMap();
+ gPortsOrch->addQueueWatermarkFlexCounters();
+ }
+ else if(key == PG_DROP_KEY)
+ {
+ gPortsOrch->generatePriorityGroupMap();
+ gPortsOrch->addPriorityGroupDropFlexCounters();
+ }
+ else if(key == PG_WATERMARK_KEY)
+ {
+ gPortsOrch->generatePriorityGroupMap();
+ gPortsOrch->addPriorityGroupWatermarkFlexCounters();
+ }
+```
+
+if queue or PG maps were already created upon queue or pg-drop enablement and then watermark is enabled,
+the queue or PG maps won't be created again.
+this is done using the private boolean storing the created status. if it's already true, the function returns.
+the same mechanism will be done in the new watermark functions.
+
+***PG map Example***
+
+```
+ if (m_isPriorityGroupMapGenerated)
+ {
+ return;
+ }
+```
+
+## 7.3 Addition of port when watermark is enabled
+
+when watermark counterpoll is enabled and a certain port is added, the watermark stats for this port will be added.
+
+# 8. Testing
+
+## 8.1 Manual testing
+
+### 8.1.1 test flex counter logic
+Test_flex_counter_logic(type=[watermark,queue,pg-drop]:
+1. Disable the counter polling for all counters and save, reload configuration
+2. 
Switch (type) + - watermark: enable the watermark counter polling, check whether PG and queue map are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB + - queue: enable the queue counter polling, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + - pg-drop: enable the pg-drop counter polling, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + +repeat this test 3 times, each time with different type [watermark,queue,pg-drop] + +### 8.1.2 test flex counter logic with reboot +Test_flex_counter_logic(type=[watermark,queue,pg-drop] reboot after counterpoll enable: +1. Disable the counter polling for all counters and save, reload configuration +2. Switch (type) + - watermark: enable the watermark counter polling, check whether PG and queue map are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB + - queue: enable the queue counter polling, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + - pg-drop: enable the pg-drop counter polling, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB +3. reboot switch + Switch (type) + - watermark: watermark counter polling is enabled, check that PG and queue maps are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB + - queue: queue counter polling is enabled, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + - pg-drop: pg-drop counter polling is enabled, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + +### 8.1.3 Adding a port when watermark is enabled + +1. disable a port +2. enable watermark counterpoll +3. verify this port stats does not exist in FLEX_COUNTERs_DB and COUNTERS_DB +4. enable the port previously disabled +5. 
verify this port stats is added to FLEX_COUNTERs_DB and COUNTERS_DB + +## 8.2 VS tests + +- Add queue-watermark and pg-drop to swss/test_flex_counters.py test + before this change only queue and pg-watermark are tested according to the old code implementation + +- Test_flex_counter_logic(type=[watermark,queue,pg-drop]: + +repeat this test 3 times, each time with different type [watermark,queue,pg-drop] +Test_flex_counter_logic(type=[watermark,queue,pg-drop] reboot after counterpoll enable: +1. Disable the counter polling for all counters and save, reload configuration +2. Switch (type) + - watermark: enable the watermark counter polling, check whether PG and queue map are generated in COUNTER_DB and both PG and QUEUE WATERMARK stats exists in FLEX_COUNTER_DB + - queue: enable the queue counter polling, check whether queue map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB + - pg-drop: enable the pg-drop counter polling, check whether PG map is generated in COUNTER_DB, no *WATERMARK* stats in FLEX_COUNTER_DB diff --git a/doc/buffer-watermark/align_watermark_flow_with_port_configuration_test_plan.md b/doc/buffer-watermark/align_watermark_flow_with_port_configuration_test_plan.md index f7969c7719e..a61fcf90641 100644 --- a/doc/buffer-watermark/align_watermark_flow_with_port_configuration_test_plan.md +++ b/doc/buffer-watermark/align_watermark_flow_with_port_configuration_test_plan.md @@ -49,7 +49,7 @@ The purpose of this test plan is to describe tests for Align watermark flow with ## Test information ### Supported topology -The test will be supported on any toplogy, since it doesn't require any traffic, +The test will be supported on any topology, since it doesn't require any traffic, except configuring port buffers since a performance improvement in queue and pg-drop maps creation was merged few days ago. 
### Test configuration

diff --git a/doc/buffer-watermark/watermarks_HLD.md b/doc/buffer-watermark/watermarks_HLD.md
index c7600e27e85..02f606a1562 100644
--- a/doc/buffer-watermark/watermarks_HLD.md
+++ b/doc/buffer-watermark/watermarks_HLD.md
@@ -232,7 +232,7 @@ In addition clear functionality will be added:
# clear queue [watermark|persistent-watermark] unicast
-# clear queue [watermark|persistent-watermark] mutlicast
+# clear queue [watermark|persistent-watermark] multicast
```
The user can clear the persistent watermark, and the "user" watermark. The user can not clear the periodic(telemetry) watermark. The clear command requires sudo, as the watermark is shared for
@@ -240,7 +240,7 @@ all users, and clear will affect every user(if a number of people are connected

#### 3.1.2.3 Show/configure telemetry interval

-The telemetry interval will be available for viewing and configuring with the folowing CLI:
+The telemetry interval will be available for viewing and configuring with the following CLI:

```
$ show watermark telemetry interval
@@ -317,7 +317,7 @@ The sai APIs anf calls are:

### 3.1.9 gRPC

-Sonic-telemetry will have acess to data in WATERMARK an HIGHEST_WATERMARK tables. For this the virtual db should be extended to access the said tables, virual path should should support mapping
+Sonic-telemetry will have access to data in WATERMARK and HIGHEST_WATERMARK tables. For this the virtual db should be extended to access the said tables, the virtual path should support mapping
ports to queues and priority groups. The exact syntax of the virtual paths is TBD. Examples of virtual paths:
@@ -341,11 +341,11 @@ Examples of virtual paths:

The core components are the flex counter, watermark orch, DB, CLI.

-The flex counter reads and clears the watermarks on a peroid of 1s by default. The values are put directly to COUNTERS table. The flex counter also has plugins configured for queue and pg, which will be triggered on every flex counter group interval. 
The lua plugin will update PERIODIC_WATERMARKS, PERSISTENT_WATERMARKS and USER_WATERMARKS with if the new value exceeds the vlaue that was read from the table.
+The flex counter reads and clears the watermarks on a period of 1s by default. The values are put directly to COUNTERS table. The flex counter also has plugins configured for queue and pg, which will be triggered on every flex counter group interval. The lua plugin will update PERIODIC_WATERMARKS, PERSISTENT_WATERMARKS and USER_WATERMARKS if the new value exceeds the value that was read from the table.

The watermark orch has 2 main functions:
- Handle the Timer that clears the PERIODIC_WATERMARKS table. Handle the configuring of the interval for the timer.
- - Handle Clear notificatons. On clear event the orch should just zero-out the corresponding watermarks from the table. It will be soon repopulated by lua plugin.
+ - Handle Clear notifications. On clear event the orch should just zero-out the corresponding watermarks from the table. It will be soon repopulated by lua plugin.

The DB contains all the tables with watemarks, and the configuration table.
diff --git a/doc/bulk_counter/bulk_counter.md b/doc/bulk_counter/bulk_counter.md
index 4bae67fa0b9..eff41b5223c 100644
--- a/doc/bulk_counter/bulk_counter.md
+++ b/doc/bulk_counter/bulk_counter.md
@@ -121,7 +121,7 @@ Furthermore, the bulk chunk size can be configured on a per counter IDs set basi
Each `COUNTER_NAME_PREFIX` defines a set of counter IDs by matching the counter IDs with the prefix. All the counter IDs in each set share a unified bulk chunk size and will be polled in a series of bulk counter polling API calls with the same counter IDs set but different port set. All such sets of counter IDs form a partition of counter IDs of the flex counter group. The partition of a flex counter group is represented by the keys of map `m_portBulkContexts`. 
-To simplify the logic, it is not supported to change the partition, which means it does not allow to split counter IDs into a differet sub sets once they have been split.
+To simplify the logic, it is not supported to change the partition, which means it does not allow splitting counter IDs into different subsets once they have been split.
 
 Eg. `SAI_PORT_STAT_IF_IN_FEC:32,SAI_PORT_STAT_IF_OUT_QLEN:0` represents
@@ -189,7 +189,7 @@ The following new types will be introduced in `container FLEX_COUNTER_TABLE` of
 /}
 ```
 
-In the yang model, each flex counter group is an independent countainer. We will define leaf in the countainer `PG_DROP`, `PG_WATERMARK`, `PORT`, `QUEUE`, `QUEUE_WATERMARK`.
+In the yang model, each flex counter group is an independent container. We will define leaves in the containers `PG_DROP`, `PG_WATERMARK`, `PORT`, `QUEUE`, `QUEUE_WATERMARK`.
 
 The update of `PG_DROP` is shown as below
 ```
@@ -248,6 +248,6 @@ As this feature does not introduce any new function, unit test shall be good eno
 
 An example shows how smaller bulk chunk size helps PFC watchdog counter polling thread to be scheduled in time.
 
-In the upper chart, the port counters are polled in a single bulk call which takes longer time. The PFC watchdog counter polling thread can not procceed until the long bulk call exits the critical section.
+In the upper chart, the port counters are polled in a single bulk call which takes a longer time. The PFC watchdog counter polling thread cannot proceed until the long bulk call exits the critical section.
 In the lower chart, the port counters are polled in a series of bulk call with smaller bulk chunk sizes. The PFC watchdog counter polling thread has more chance to be scheduled in time. 
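The scheduling benefit described in the two charts can be sketched in a few lines of hypothetical Python (the real flex counter is C++; this only illustrates why splitting one large bulk call into chunks helps):

```python
def poll_in_chunks(ports, chunk_size):
    # Split one large bulk poll into several smaller bulk calls. Each
    # yielded chunk corresponds to one bulk counter API call; other
    # threads (e.g. the PFC watchdog poller) can acquire the critical
    # section between calls. chunk_size 0 means "all ports at once".
    if chunk_size <= 0:
        chunk_size = max(len(ports), 1)
    for i in range(0, len(ports), chunk_size):
        yield ports[i:i + chunk_size]
```

With 64 ports and a chunk size of 32, two shorter bulk calls replace one long one, bounding how long the critical section is held at a stretch.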
diff --git a/doc/bum_storm_control/bum_storm_control_hld.md b/doc/bum_storm_control/bum_storm_control_hld.md
index 3eff1943131..6239977b511 100644
--- a/doc/bum_storm_control/bum_storm_control_hld.md
+++ b/doc/bum_storm_control/bum_storm_control_hld.md
@@ -87,7 +87,7 @@ This document describes the functionality and high level design of Broadcast, Un
 # 1 Feature Overview
 A traffic storm occurs when packets flood the LAN, creating excessive traffic and degrading network performance. The type of traffic can be Broadcast, Unknown-unicast or unknown-Multicast (BUM). The storm-control feature allows the user to limit the amount of BUM traffic admitted to the system. This can be achieved by configuring the type of storm (Broadcast or Unknown-unicast or unknown-Multicast) and the corresponding kilo bits per second (kbps) parameter on a given physical interface. Traffic that exceeds the configured rate will be dropped.
-Unknown-multicast traffic consists of all multicast traffic which donot match any of the statically configured or dynamically learned multicast groups.
+Unknown-multicast traffic consists of all multicast traffic which does not match any of the statically configured or dynamically learned multicast groups.
diff --git a/doc/cbf/cbf_hld.md b/doc/cbf/cbf_hld.md
index 17e6c730eae..4714dd27377 100644
--- a/doc/cbf/cbf_hld.md
+++ b/doc/cbf/cbf_hld.md
@@ -53,7 +53,7 @@ DSCP/EXP value of W for --> in the DSCP/EXP to FC --> assigned to the --> lookup
 destination D map table for W packet for destination D is a CBF group of Y based on the FC destination D value X
 ```
-This feature enables opeartors, among other things, to send the important (foreground) traffic through the shortest path, while sending the background traffic through longer paths to still give it some bandwidth instead of using QoS queues which may block background traffic from getting bandwitdh. 
+This feature enables operators, among other things, to send the important (foreground) traffic through the shortest path, while sending the background traffic through longer paths to still give it some bandwidth instead of using QoS queues which may block background traffic from getting bandwidth. These new class based next hop groups are allowed thanks to the changes in https://github.com/opencomputeproject/SAI/pull/1193, which allow a next hop group object to also have other next hop group objects as members of the group along with the next hop objects. The way such a next hop group works is that a packet which has a Forwarding Class value of X will be matched against an appropriate member of this group, selected based on the Forwarding Class value thanks to the "selection_map" property of the group. As an example, given the CBF group with members Nhg1, Nhg2 and Nhg3 and a selection map of FC 0 -> Nhg1, FC 1 -> Nhg2 and FC 3 -> Nhg3, a packet which has an FC value of 0 will be forwarded using Nhg1. Note that multiple FC values can point to the same member, but a single FC value can't be mapped to more than one member. diff --git a/doc/cli_auto_generation/cli_auto_generation.md b/doc/cli_auto_generation/cli_auto_generation.md index fc88a8a3826..bedac326159 100644 --- a/doc/cli_auto_generation/cli_auto_generation.md +++ b/doc/cli_auto_generation/cli_auto_generation.md @@ -136,10 +136,10 @@ The [manifest.json](https://github.com/stepanblyschak/SONiC/blob/sonic-app-ext-3 | Path | Type | Mandatory | Description | | --------------------------------- | ------ | --------- | ------------------------------------------------------------------------- | -| /cli/auto-generate-config | boolean| yes | ON/OFF triger for auto-generation of CLI command *config*. Default: false | -| /cli/auto-generate-show | boolean| yes | ON/OFF triger for auto-generation of CLI command *show*. 
Default: false | +| /cli/auto-generate-config | boolean| yes | ON/OFF trigger for auto-generation of CLI command *config*. Default: false | +| /cli/auto-generate-show | boolean| yes | ON/OFF trigger for auto-generation of CLI command *show*. Default: false | -By default, CLI is autogenerated for all YANG modules provided by the extension. Developer can optionally specify explicitelly which YANG modules to use for auto-generated CLI: +By default, CLI is autogenerated for all YANG modules provided by the extension. Developer can optionally specify explicitly which YANG modules to use for auto-generated CLI: | Path | Type | Mandatory | Description | | --------------------------------- | ------ | --------- | ------------------------------------------------------------------------- | @@ -296,7 +296,7 @@ ACS-MSN2100 UP r-sonic-switch x86_64-mlnx_msn2100-r0 ``` { "DEVICE_METADATA": { - "locahost": { + "localhost": { "hwsku": "ACS-MSN2100", "default_bgp_status": "up", "hostname": "r-sonic-switch", diff --git a/doc/cmis-module-enhancement/cmis-module-enhancement.md b/doc/cmis-module-enhancement/cmis-module-enhancement.md index f7005513b09..9cb75a75a25 100644 --- a/doc/cmis-module-enhancement/cmis-module-enhancement.md +++ b/doc/cmis-module-enhancement/cmis-module-enhancement.md @@ -11,7 +11,7 @@ ## 2. Scope -This section describes an enhancment of the synchronization between ASIC port and module configuration. +This section describes an enhancement of the synchronization between ASIC port and module configuration. ## 3. Definitions/Abbreviations @@ -26,7 +26,7 @@ These initialization processes should be synchronized and the configuration of C Currently, SONIC uses the "host_tx_ready" flag in the PORT table in STATE DB for synchronization. This flag is set by Ports OA right after the SAI API for setting the Admin status to UP returns with OK/Success status. 
PMON registers for changes of this flag in Redis DB and starts the CMIS initialization for a particular module when this flag is set.
 Current design has some gaps with synchronization between ASIC and module configuration.
-This document purpose is to introduce an enhacement that will address the gaps and find a backward compatible solution.
+The purpose of this document is to introduce an enhancement that will address the gaps and find a backward compatible solution.
 
 ## 5. Requirements
 
@@ -37,7 +37,7 @@ This document purpose is to introduce an enhacement that will address the gaps a
 
 * Vendor SDK/FW shall support asynchronous notification of start/stop of sending high-speed signal from ASIC to module.
 
-* PortsOrch shall set host_tx_ready in state DB, only when it recieved notification that the high-speed signal is sent.
+* PortsOrch shall set host_tx_ready in state DB only when it has received notification that the high-speed signal is sent.
 
 ## 6. High-Level Design
 
@@ -81,7 +81,7 @@ High Level Flow:
 #### 6.2.1. host_tx_signal
 
 This flow shall be used only on supporting platforms.
-Hence, as part of PortsOrch initialization, SONiC will query SAI capabilities regarding the support of allowence flag for sending high-speed signal to module - It will be done by checking if SAI_PORT_ATTR_HOST_TX_SIGNAL_ENABLE is supported.
+Hence, as part of PortsOrch initialization, SONiC will query SAI capabilities regarding the support of the allowance flag for sending the high-speed signal to the module - it will be done by checking if SAI_PORT_ATTR_HOST_TX_SIGNAL_ENABLE is supported.
 In case SAI supports it, PortsOrch will start listening to TRANSCEIVER_INFO in State DB to know on any module plug event.
 Module's INSERTION/REMOVAL events shall trigger the calling of SAI API on a Port object with SAI_PORT_ATTR_HOST_TX_SIGNAL_ENABLE to enable or disable data signal from ASIC to module. 
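The capability-gated plug-event flow above can be sketched as follows. This is a hedged illustration, not the actual PortsOrch code: `FakeSai` stands in for the SAI layer, and only the attribute name `SAI_PORT_ATTR_HOST_TX_SIGNAL_ENABLE` comes from the design text.

```python
HOST_TX_SIGNAL = "SAI_PORT_ATTR_HOST_TX_SIGNAL_ENABLE"

class FakeSai:
    """Stand-in for the SAI layer, for illustration only."""
    def __init__(self, supports_host_tx_signal):
        self._supported = {HOST_TX_SIGNAL: supports_host_tx_signal}
        self.attrs = {}

    def query_capability(self, attr):
        return self._supported.get(attr, False)

    def set_port_attr(self, port, attr, value):
        self.attrs[(port, attr)] = value

def on_module_event(sai, port, inserted):
    # INSERTION/REMOVAL toggles the high-speed signal from ASIC to module,
    # but only on platforms where SAI reports the attribute as supported.
    if sai.query_capability(HOST_TX_SIGNAL):
        sai.set_port_attr(port, HOST_TX_SIGNAL, inserted)

sai = FakeSai(supports_host_tx_signal=True)
on_module_event(sai, "Ethernet0", inserted=True)
```

On a platform that does not report the capability, the event handler leaves the port attributes untouched and the legacy host_tx_ready behaviour applies.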
diff --git a/doc/config-generic-update-rollback/Json_Change_Application_Design.md b/doc/config-generic-update-rollback/Json_Change_Application_Design.md
index f5a914a2ff5..fb076d33ece 100644
--- a/doc/config-generic-update-rollback/Json_Change_Application_Design.md
+++ b/doc/config-generic-update-rollback/Json_Change_Application_Design.md
@@ -127,7 +127,7 @@ void apply-change(JsonChange jsonChange)
 |errors |malformedChangeError | Will be raised if the input JsonChange is not valid according to [SONiC_Generic_Config_Update_and_Rollback_Design](SONiC_Generic_Config_Update_and_Rollback_Design.md#31141-jsonchange). |
 |other errors | Check [SONiC_Generic_Config_Update_and_Rollback_Design](SONiC_Generic_Config_Update_and_Rollback_Design.md#31141-apply-change) for exact list of errors to expect.
 |side-effects|updating running-config| This operation will cause changes to the running-config according to the input JsonChange.
-|assumptions |running-config locked| The implementor of this contract will interact with ConfigDB to updating the running-config, it is assumed the running-config is locked for changes for the lifespan of the operation.
+|assumptions |running-config locked| The implementer of this contract will interact with ConfigDB to update the running-config; it is assumed the running-config is locked for changes for the lifespan of the operation.
 
 The important constraint in the above interface is the JsonChange, where ordering of applying the modifications is arbitrary.
 
@@ -224,7 +224,7 @@ Or it can be updated using the following steps:
 { "op": "add", "path": "/DHCP_SERVER/192.0.0.3", "value": {} },
 ```
 
-Applying JsonChange is all about the running config being converted to the target config, the steps to update can be arbitrary and are decided by the implementor of the `apply-change` interface. 
+Applying JsonChange is all about the running config being converted to the target config, the steps to update can be arbitrary and are decided by the implementer of the `apply-change` interface. Our current design documents will use the "Table" as the main granular element of update. We will update "Table" by "Table" in alphabetical order. Each table update will take care of updating table entries in ConfigDB, restarting services if needed and verifying services have absorbed diff --git a/doc/config-generic-update-rollback/Json_Patch_Ordering_using_YANG_Models_Design.md b/doc/config-generic-update-rollback/Json_Patch_Ordering_using_YANG_Models_Design.md index 926d06725d6..964eacb7b20 100644 --- a/doc/config-generic-update-rollback/Json_Patch_Ordering_using_YANG_Models_Design.md +++ b/doc/config-generic-update-rollback/Json_Patch_Ordering_using_YANG_Models_Design.md @@ -119,7 +119,7 @@ list order-patch(JsonPatch jsonPatch) |errors |malformedPatchError | Will be raised if the input JsonPatch is not valid according to [JSON Patch (RFC6902)](https://tools.ietf.org/html/rfc6902). | |other errors | Check [SONiC Generic Configuration Update and Rollback - HLD](SONiC_Generic_Config_Update_and_Rollback_Design.md#3114-patch-orderer) for exact list of errors to expect. |side-effects|None | -|assumptions |running-config locked| The implementor of this contract might interact with ConfigDB to get the running-config, it is assumed the running-config is locked for changes for the lifespan of the operation. +|assumptions |running-config locked| The implementer of this contract might interact with ConfigDB to get the running-config, it is assumed the running-config is locked for changes for the lifespan of the operation. The important constraint in the above interface is the JsonChange, where the application of a single JsonChange does not guarantee any ordering. If ordering is needed, multiple JsonChanges should be returned. 
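The "Table by Table, in alphabetical order" strategy above can be sketched in a few lines. This is a minimal illustration only: the change is modeled as `{table: {key: entry-or-None}}`, and `restart_service` stands in for the per-table service restart and verification, which the design leaves to the implementer.

```python
def apply_change(running_config, change, restart_service):
    # Tables are processed in alphabetical order; within a table the order
    # of entry updates is arbitrary, as the apply-change contract allows.
    for table in sorted(change):
        for key, entry in change[table].items():
            if entry is None:
                running_config.get(table, {}).pop(key, None)   # delete entry
            else:
                running_config.setdefault(table, {})[key] = entry
        restart_service(table)   # restart/verify dependent services once per table

restarted = []
cfg = {"DHCP_SERVER": {"192.0.0.1": {}}}
apply_change(cfg,
             {"DHCP_SERVER": {"192.0.0.1": None, "192.0.0.3": {}}},
             restarted.append)
```

Note that the per-table grouping is what keeps each dependent service to a single restart, regardless of how many entries in that table the JsonChange touches.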
diff --git a/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md b/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md index 18e200cbf9b..cc3afa9ec10 100644 --- a/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md +++ b/doc/config-generic-update-rollback/SONiC_Generic_Config_Update_and_Rollback_Design.md @@ -403,7 +403,7 @@ Here is a summary explaining the `order-patch` contract, Check [3.1.1.4 Patch Or |errors |malformedPatchError | Will be raised if the input JsonPatch is not valid according to [JSON Patch (RFC6902)](https://tools.ietf.org/html/rfc6902). | |other errors | Check [3.1.1.4.2 Order-Patch](#31142-order-patch) for exact list of errors to expect. |side-effects|None | -|assumptions |running-config locked| The implementor of this contract might interact with ConfigDB to get the running-config, it is assumed the running-config is locked for changes for the lifespan of the operation. +|assumptions |running-config locked| The implementer of this contract might interact with ConfigDB to get the running-config, it is assumed the running-config is locked for changes for the lifespan of the operation. #### Stage-3 Applying list of JsonChanges in order There are a few SONiC applications which store their configuration in the ConfigDB. These applications do not subscribe to the ConfigDB change events. So any changes to their corresponding table entries as part of the patch apply process in the ConfigDB are not processed by the application immediately. In order to apply the configuration changes, corresponding service needs to be restarted. Listed below are some example tables from SONiC config, and the corresponding services that need to be manually restarted. 
@@ -444,7 +444,7 @@ Here is a summary explaining the `apply-change` contract, Check [3.1.1.4 Change
 |errors |malformedChangeError | Will be raised if the input JsonChange is not valid according to [3.1.1.4.1 JsonChange](#31141-jsonchange). |
 |other errors | Check [3.1.1.4.1 apply-change](#31141-apply-change) for exact list of errors to expect.
 |side-effects|updating running-config| This operation will cause changes to the running-config according to the input JsonChange.
-|assumptions |running-config locked| The implementor of this contract will interact with ConfigDB to updating the running-config, it is assumed the running-config is locked for changes for the lifespan of the operation.
+|assumptions |running-config locked| The implementer of this contract will interact with ConfigDB to update the running-config; it is assumed the running-config is locked for changes for the lifespan of the operation.
 
 #### Stage-4 Post-update validation
 The expectations after applying the JsonPatch is that it will adhere to [RFC 6902](https://tools.ietf.org/html/rfc6902).
@@ -606,11 +606,11 @@ The only condition of JsonChange is that the final outcome after applying the wh
 | |conflictingStateError | Will be raised if the patch cannot be applied to the current state of the running config e.g. trying to add an item to a non-existing json dictionary. |
 | |internalError | Will be raised if any other error is encountered that's different than the ones listed above.
 |side-effects|None |
-|assumptions |running-config locked | The implementor of this contract might interact with ConfigDB to get the running-config, it is assumed the ConfigDB is locked for changes for the lifespan of the operation.
+|assumptions |running-config locked | The implementer of this contract might interact with ConfigDB to get the running-config; it is assumed the ConfigDB is locked for changes for the lifespan of the operation.
 
If `order-patch` has to force the update to follow very specific steps, it would have to provide multiple JsonChange objects in the return list of `order-patch`.
 
-`order-patch` is returning a list of JsonChanges instead of a simple JsonPatch with multiple operations because a JsonChange can group together multiple JsonPatch operations that share no dependency and can be executed together. This can help the implementor of `apply-change` to optimize the mechanism for applying JsonChange e.g. group changes under same parent together or reduce number of service restarts.
+`order-patch` returns a list of JsonChanges instead of a simple JsonPatch with multiple operations because a JsonChange can group together multiple JsonPatch operations that share no dependency and can be executed together. This can help the implementer of `apply-change` to optimize the mechanism for applying JsonChange, e.g. group changes under the same parent together or reduce the number of service restarts.
 
 For example:
 Assume JsonPatch contains:
@@ -632,7 +632,7 @@ We have 2 operations updating DHCP servers, and another operation for DEVICE_NEI
 ]
 ```
 Updating DHCP_SERVERS requires restarting `dhcp_relay` service, so if the above patch is to be executed in order, we will restart `dhcp_relay` service twice.
-But since the implementor of `apply-change` can order the operations in any way they see fit since they are OK to update together. They can decide move the DHCP updates together, and DHCP table twice, but restart `dhcp_relay` service only once.
+But since these operations are OK to update together, the implementer of `apply-change` can order them in any way they see fit: they can decide to move the DHCP updates together, update the DHCP table twice, but restart the `dhcp_relay` service only once.
 
 Let's take a visual example, assume we have a JsonPatch with 8 operations, and here is the topological order of the operation. Arrow from op-x to op-y means op-y depends on op-x. 
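The `dhcp_relay` example above can be sketched as a simple grouping step. This is illustrative only, and is valid exactly because the grouped operations share no ordering dependency, as the contract requires:

```python
def group_by_table(operations):
    # Bundle JSON Patch operations that touch the same ConfigDB table
    # (the first path segment) into one JsonChange, so a dependent service
    # such as dhcp_relay is restarted once per table, not once per operation.
    groups = {}
    for op in operations:
        table = op["path"].split("/")[1]   # "/DHCP_SERVER/192.0.0.1" -> "DHCP_SERVER"
        groups.setdefault(table, []).append(op)
    return list(groups.values())

ops = [
    {"op": "remove", "path": "/DHCP_SERVER/192.0.0.1"},
    {"op": "add", "path": "/DEVICE_NEIGHBOR/Ethernet8", "value": {}},
    {"op": "add", "path": "/DHCP_SERVER/192.0.0.3", "value": {}},
]
changes = group_by_table(ops)   # two groups: both DHCP ops land together
```

A real orderer must of course only merge operations after checking, against the YANG dependency graph, that no ordering constraint exists between them.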
@@ -646,7 +646,7 @@ But if we organize the operations into groups of JsonChange, we will have:
 
 jsonchanges-order
 
-This will allow the the implementor of `apply-change` to have the freedom to optimize the operations in any order they see fit.
+This will allow the implementer of `apply-change` the freedom to optimize the operations in any order they see fit.
 
 **NOTE:** Check Patch Orderer implementation design design details in [Json_Patch_Ordering_using_YANG_Models_Design](Json_Patch_Ordering_using_YANG_Models_Design.md) document.
 
@@ -668,9 +668,9 @@ void apply-change(JsonChange jsonChange)
 | |unprocessableRequestError| Will be raised if the change is valid, all the resources are found but when applying the change it causes an error in the system.
 | |internalError | Will be raised if any other error is encountered that's different than the ones listed above.
 |side-effects|updating running-config | This operation will cause changes to the running-config according to the input JsonChange.
-|assumptions |running-config locked | The implementor of this contract will interact with ConfigDB to updating the running-config, it is assumed the ConfigDB is locked for changes for the lifespan of the operation.
+|assumptions |running-config locked | The implementer of this contract will interact with ConfigDB to update the running-config; it is assumed the ConfigDB is locked for changes for the lifespan of the operation.
 
-Since the order of executing the operation does not matter, the implementor of this component can work on optimizing the time to run the operation. For details check [3.1.1.4 Change Applier](#3115-change-applier).
+Since the order of executing the operations does not matter, the implementer of this component can work on optimizing the time to run the operation. For details check [3.1.1.4 Change Applier](#3115-change-applier).
 
**NOTE:** Check Change Applier implementation design design details in [Json_Change_Application_Design](Json_Change_Application_Design.md) document.
 
@@ -695,7 +695,7 @@ Same as [3.1.1.3 YANG models](#3113-yang-models)
 same as [3.1.1.5 ConfigDB](#3115-configdb)
 
 #### 3.1.2.5 File system
-This will the file system where SONiC os is setup. Some suggestions prefer the path `/var/sonic/checkpoints`, but that is not decided yet. Will leave it to the implementor of this design document to decide.
+This will be the file system where the SONiC OS is set up. Some suggestions prefer the path `/var/sonic/checkpoints`, but that is not decided yet. We will leave it to the implementer of this design document to decide.
 
 ### 3.1.3 Rollback
 rollback-design
@@ -736,7 +736,7 @@ same as [3.1.1.5 ConfigDB](#3115-configdb)
 
 #### 3.2.1.1 JsonPatch
 
-The JsonPatch consistes of a list operation, and each operation follows this format:
+The JsonPatch consists of a list of operations, and each operation follows this format:
 ```
 { "op": "", "path": "", "value": "", "from": "" }
 ```
diff --git a/doc/config_reload/config_reload_enhancement.md b/doc/config_reload/config_reload_enhancement.md
index a58fe280593..b6dc2bc29ae 100644
--- a/doc/config_reload/config_reload_enhancement.md
+++ b/doc/config_reload/config_reload_enhancement.md
@@ -116,6 +116,6 @@ Hostcfgd tests would be enhanced to cover the new flow.
 - Ensure the delayed services are started as soon as PortInitDone is seen in APPL_DB table. Example services includes snmp, lldp and telemetry.
 
 #### System tests
-There are existing SONiC mgmt tests to cover config reload scenario. After this feature the existing tests should run without degradation. The only noticable differentiation is the switch would be initialized faster in the new flow compared to the existing flow.
+There are existing SONiC mgmt tests to cover config reload scenario. After this feature the existing tests should run without degradation. 
The only noticeable difference is that the switch would be initialized faster in the new flow compared to the existing flow.
diff --git a/doc/config_yang_validation/config_db_yang_validation.md b/doc/config_yang_validation/config_db_yang_validation.md
index 121783afde4..744ddfede1c 100644
--- a/doc/config_yang_validation/config_db_yang_validation.md
+++ b/doc/config_yang_validation/config_db_yang_validation.md
@@ -94,7 +94,7 @@ However, these specs for validation are also defined in the sonic-portchannel YA
 For PortChannel and many of the config fields listed above, the field validation is unnecessarily duplicated in sonic-utilities/config and in YANG models. The goal of this project is to utilize preexisting YANG models to validate config CLI field updates, without unnecessarily defining separate, redundant validation specs ad-hoc in sonic-utilities.
 
-Some necessary ad-hoc validations are not yet reflected in YANG models. For those missing validations that YANG infrastructure supports, GitHub issues are created to track the progress to filling in these YANG model gaps. In the uncommon scenario where YANG infrastructure is incapable of suporting a certain check, ad-hoc validation will be left in-place.
+Some necessary ad-hoc validations are not yet reflected in YANG models. For those missing validations that YANG infrastructure supports, GitHub issues are created to track the progress of filling in these YANG model gaps. In the uncommon scenario where YANG infrastructure is incapable of supporting a certain check, ad-hoc validation will be left in-place.
 
 During the migration process, we will first leave the ad-hoc validation code in-place, and leave an option to configure type of validation used: ad-hoc or YANG validation. Once YANG validation has stabilized and YANG validation coverage has widened, we will remove the preexisting ad-hoc validation code for non performance-sensitive scenarios.
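The migration switch described above can be sketched as follows. This is a hedged illustration only: both `yang_validate` and `adhoc_validate` are stand-in callables, not the actual sonic-utilities or sonic-yang-mgmt APIs.

```python
def set_field(config_db, table, key, field, value, use_yang,
              yang_validate, adhoc_validate):
    # Validate the staged ConfigDB update before writing it: use the
    # YANG-based validator when enabled, otherwise fall back to the
    # legacy ad-hoc check for that field.
    candidate = {table: {key: {field: value}}}
    ok = yang_validate(candidate) if use_yang else adhoc_validate(table, field, value)
    if not ok:
        raise ValueError("invalid %s|%s %s=%r" % (table, key, field, value))
    config_db.setdefault(table, {}).setdefault(key, {})[field] = value

db = {}
set_field(db, "PORTCHANNEL", "PortChannel01", "min_links", 2,
          use_yang=True,
          yang_validate=lambda cand: True,         # pretend the model accepts it
          adhoc_validate=lambda t, f, v: True)
```

The point of the gate is that the write to ConfigDB happens only after whichever validator is active has accepted the candidate update.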
diff --git a/doc/console/Portable-Console-Device-High-Level-Design.md b/doc/console/Portable-Console-Device-High-Level-Design.md index 395b6e8dd1d..b9e68c0831d 100644 --- a/doc/console/Portable-Console-Device-High-Level-Design.md +++ b/doc/console/Portable-Console-Device-High-Level-Design.md @@ -127,7 +127,7 @@ class PortableConsoleDeviceBase: @classmethod def is_plugged_in(cls): """ - Retrives whether portable console device is plugged in or not. + Retrieves whether portable console device is plugged in or not. This method is mandatory for factory function to auto detect vendor name and model name. :return: A boolean, True if portable console device is plugged in @@ -138,7 +138,7 @@ class PortableConsoleDeviceBase: @classmethod def get_vendor_name(cls): """ - Retrives the vendor name of the `PortableConsoleDeviceBase` concrete subclass. + Retrieves the vendor name of the `PortableConsoleDeviceBase` concrete subclass. This method is mandatory for factory function to create instance from manual configuration. :return: A string, denoting vendor name of the `PortableConsoleDeviceBase` concrete subclass. @@ -148,7 +148,7 @@ class PortableConsoleDeviceBase: @classmethod def get_model_name(cls): """ - Retrives the model name of the `PortableConsoleDeviceBase` concrete subclass. + Retrieves the model name of the `PortableConsoleDeviceBase` concrete subclass. This method is mandatory for factory function to create instance from manual configuration. :return: A string, denoting model name of the `PortableConsoleDeviceBase` concrete subclass. @@ -201,7 +201,7 @@ class PortableConsoleDeviceBase: def get_all_lines(self): """ - Retrieves the infomation of all console lines on portable console devices. + Retrieves the information of all console lines on portable console devices. :return: A dict, the key is console line number (integer, 1-based), the value is an object derived from `sonic_console.line_info.ConsoleLineInfo`. 
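The classmethods above exist so that a factory can resolve a concrete device class. A hypothetical sketch of such a factory follows; everything beyond the three classmethod names (`get_vendor_name`, `get_model_name`, `is_plugged_in`) is invented for illustration:

```python
def resolve_device_class(candidates, vendor=None, model=None):
    # Manual vendor/model configuration takes priority; otherwise fall
    # back to auto detection via is_plugged_in(), the recommended path.
    if vendor and model:
        for cls in candidates:
            if cls.get_vendor_name() == vendor and cls.get_model_name() == model:
                return cls
    for cls in candidates:
        if cls.is_plugged_in():
            return cls
    return None

class DemoDevice:
    """Invented stand-in for a PortableConsoleDeviceBase subclass."""
    @classmethod
    def is_plugged_in(cls):
        return True
    @classmethod
    def get_vendor_name(cls):
        return "acme"
    @classmethod
    def get_model_name(cls):
        return "usb-console-8"
```

Registering each vendor's concrete subclass in `candidates` lets the factory pick the right one either from explicit configuration or from the plug-in probe.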
@@ -395,7 +395,7 @@ The flow chart below describes how `get_portable_console_device` function works:
 
 ![factory-function-flow-chart.png](./Portable-Console-Device-High-Level-Design/factory-function.png)
 
-As mentioned above, only the third way is our recommendation, which can automatically detect which vendor's device is plugged in and create the corresponing object. The first and second ways are reserved for more flexibility, so they are given higher priority.
+As mentioned above, only the third way is our recommendation, which can automatically detect which vendor's device is plugged in and create the corresponding object. The first and second ways are reserved for more flexibility, so they are given higher priority.
 
 ## SONiC CLI Design
diff --git a/doc/console/SONiC-Console-Switch-High-Level-Design.md b/doc/console/SONiC-Console-Switch-High-Level-Design.md
index 179b02471c6..0517cee1c07 100644
--- a/doc/console/SONiC-Console-Switch-High-Level-Design.md
+++ b/doc/console/SONiC-Console-Switch-High-Level-Design.md
@@ -110,16 +110,16 @@ This document describes the functionality and high level design of the Console s
 
 | Term | Meaning
 |---|---|
-| SSH | **S**ecure **S**hell |
+| SSH | **S**ecure **Sh**ell |
 | CLI | **C**ommand **L**ine **I**nterface |
 | OS | **O**perating **S**ystem |
 | USB | **U**niversal **S**erial **B**us |
 | TTY | **T**ele**TY**pewriter, terminal for text input/output environment |
-| PID | **P**rocess **ID**entification number |
+| PID | **P**rocess **ID**entification number |
 | ETH | **ETH**ernet |
 | UTC | **C**oordinated **U**niversal **T**ime |
 | MGMT | **M**ana**G**e**M**en**T** |
-| TCP | **T**ransmission **C**ontrol **P**rotocol |
+| TCP | **T**ransmission **C**ontrol **P**rotocol |
 
 # 1 Feature Overview
 
@@ -269,8 +269,8 @@ ssh tom@2001:db8::1
 # connect to DeviceB
 ssh tom@2001:db8::2
 
-# Assume that domain name DeviceC.co point to 2001:db8::3
-ssh tom@DeviceC.co
+# Assume that domain name devicec.co points to 2001:db8::3
+ssh tom@devicec.co
```
 
 The mechanism behind it is actually very similar to mode A. We will record the target ssh host IP address to a environment variable `$SSH_TARGET_IP`. Since we have stored the relationship between line number and it's management IP, then we can easily start the management session automatically after user login by calling `consutil connect` command in `/etc/bash.bashrc`. If the management IP were not found in config db (consutil connect failed due to target not found), then we will fall back to normal SONiC management bash session.
diff --git a/doc/console/serial-console-HLD.md b/doc/console/serial-console-HLD.md
index 23cb15eaf2f..7368ddf7009 100644
--- a/doc/console/serial-console-HLD.md
+++ b/doc/console/serial-console-HLD.md
@@ -65,12 +65,12 @@ We want to enhance configDB to include table for serial-console global configura
 We want to enable serial-console configuration in SONIC. In order to do so will touch few areas in the system:
 1. configDB - to include a dedicated table for configurations
 2. hostcfg demon - to trigger dedicated service on config apply.
-3. OS config files - specific for this stage we are only /etc/profile.d/tmout-env.sh and /etc/sysctl.d/95-sysrq-sysctl.conf and /proc/sys/kernel/sysrq are going to be modifed by the serial-config.sh runned by serial-config.service .
+3. OS config files - at this stage only /etc/profile.d/tmout-env.sh, /etc/sysctl.d/95-sysrq-sysctl.conf and /proc/sys/kernel/sysrq are going to be modified by serial-config.sh, run by serial-config.service.
 
 ##### Flow diagram
 ![serial_console_flow](serial_console_flow.png)
 ### 3.1 Flow description
-When the feature is enabled, by modifying the DB manually, user will set serial-console configurations by modifing CONFIG_DB in SERIAL_CONSOLE table.
+When the feature is enabled (by modifying the DB manually), the user will set serial-console configurations by modifying the SERIAL_CONSOLE table in CONFIG_DB.
The hostcfgd daemon will be extended to listen to configurations from SERIAL_CONSOLE table and restarts the serial_console.service. Serial console script will read SERIAL_CONSOLE table and update config files accordingly. @@ -216,10 +216,10 @@ Example sub-sections for unit test cases and system test cases are given below. Configuration 1. Configure auto-logout for serial-console. 1.1. Configure and apply non-default auto-logout value (1-2 min.) -1.2. Connect and login via serial-console. Validate auto-logout happend in configured time (1-2 min.) +1.2. Connect and login via serial-console. Validate auto-logout happened in configured time (1-2 min.) 2. Init flow for auto-logout. 2.1. Don't save previous auto-logout configuration and reboot the switch. -2.2. After boot connect and login via serial-console. Validate that auto-logout didn't happend in previously configured time (1-2 min.) +2.2. After boot connect and login via serial-console. Validate that auto-logout didn't happen in previously configured time (1-2 min.) 3. Configure sysrq parameter. 3.1. Configure and apply non-default sysrq-capabilities parameter (enabled) 3.2. Check sysrq parameter value in linux proc filesystem being changed to new applied value of "1" diff --git a/doc/copp/Copp_Neighbor_Miss_Trap_And_Enhancements.md b/doc/copp/Copp_Neighbor_Miss_Trap_And_Enhancements.md index 0bf5a9567e3..6fb4d21dad0 100755 --- a/doc/copp/Copp_Neighbor_Miss_Trap_And_Enhancements.md +++ b/doc/copp/Copp_Neighbor_Miss_Trap_And_Enhancements.md @@ -283,7 +283,7 @@ No warmboot and fastboot impact is expected for this feature. #### 12.1.1. SWSS Unit Test Cases * **Neighbor Miss default configuration verification:** Default copp group and trap configuration of neighbor miss will be added to test_copp.py to verify the default configuration. -* **Trap hw_status verification:** Existing test cases currently verify trap configuration by SET/DEL on CONFIG_DB and vaidating the configuration by GET on ASIC_DB. 
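The mapping from SERIAL_CONSOLE table fields to the two OS-level settings named above (the TMOUT auto-logout export and the kernel sysrq toggle) could look roughly like this. This is an invented helper, not the real serial-config.sh, and the field names are assumptions for illustration:

```python
def render_serial_config(entry):
    # Map a SERIAL_CONSOLE table entry (field names assumed) onto the
    # files the design says serial-config.sh modifies.
    minutes = int(entry.get("inactivity_timeout", 15))
    sysrq_on = entry.get("sysrq_capabilities", "disabled") == "enabled"
    return {
        "/etc/profile.d/tmout-env.sh": "export TMOUT=%d\n" % (minutes * 60),
        "/proc/sys/kernel/sysrq": "1" if sysrq_on else "0",
    }

files = render_serial_config({"inactivity_timeout": "2",
                              "sysrq_capabilities": "enabled"})
```

A 2-minute timeout matches test case 1.1 above: the shell's `TMOUT` is expressed in seconds, so the value written is 120.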
These test cases will be extended to also verify the hw_status field in STATE_DB. +* **Trap hw_status verification:** Existing test cases currently verify trap configuration by SET/DEL on CONFIG_DB and validating the configuration by GET on ASIC_DB. These test cases will be extended to also verify the hw_status field in STATE_DB. * **STATE_DB capability table verification:** New test cases will be added to verify the COPP_TRAP_CAPABILITY_TABLE table in STATE_DB. Test case will perform GET operation and verify trap_id_list field in the table is not empty. #### 12.1.2. CLI UT Test Cases diff --git a/doc/dash/dash-sonic-hld.md b/doc/dash/dash-sonic-hld.md index f6a75b4a0e2..66c5a36fb6d 100644 --- a/doc/dash/dash-sonic-hld.md +++ b/doc/dash/dash-sonic-hld.md @@ -728,7 +728,7 @@ metering_class_or = uint32 DASH_TUNNEL_TABLE shall have one or more endpoints. Encap type, VNI are create only attributes. A change on encap would require deleting and creating new tunnel objects. One endpoint is treated as single nexthop and comma separated multiple endpoints shall be treated as ECMP nexthop. For return packet from the tunnel, expectation is to have the same encap type. -For single endpoint, implmentation shall simply create a sai_dash_tunnel object with ```SAI_DASH_TUNNEL_ATTR_DIP=endpoint IP``` and ```SAI_DASH_TUNNEL_ATTR_MAX_MEMBER_SIZE=1``` +For single endpoint, implementation shall simply create a sai_dash_tunnel object with ```SAI_DASH_TUNNEL_ATTR_DIP=endpoint IP``` and ```SAI_DASH_TUNNEL_ATTR_MAX_MEMBER_SIZE=1``` For ECMP, implementation shall create ```sai_dash_tunnel_member``` and ```sai_dash_tunnel_next_hop``` with appropriate ```SAI_DASH_TUNNEL_ATTR_MAX_MEMBER_SIZE```. Since MAX_MEMBER_SIZE is set during creation, it is expected that adding new member will be a new DASH_TUNNEL object creation. However, implementation shall support removing members. @@ -1310,7 +1310,7 @@ Refer DASH documentation for the test plan. 
"DASH_METER_RULE: {
     "meter_policy_id": "245bea34-1000-0000-0000-0000082764ac",
     "rule_num": "1",
-    "prioirty": "0",
+    "priority": "0",
     "ip_prefix": "40.0.0.1/32",
     "metering_class":"20000"
 },
diff --git a/doc/database/multi_database_instances.md b/doc/database/multi_database_instances.md
index c4858d4dca3..55eb4949911 100644
--- a/doc/database/multi_database_instances.md
+++ b/doc/database/multi_database_instances.md
@@ -30,7 +30,7 @@ DUT try to load a new images
 * We introduce a new configuration file.
-* This file contains how many redis instances and databases , also the configration of each database , including instance, dbid, separator.
+* This file contains how many redis instances and databases there are, as well as the configuration of each database, including instance, dbid, separator.
 ```json
 {
@@ -94,7 +94,7 @@ DUT try to load a new images
 * By default, each image has one default startup database\_config.json file in SONiC file system at /etc/default/sonic-db/.
-* The users is able to use the customized database configration, what needs to do is creating a database\_config.josn file and place it at /etc/sonic/
+* Users are able to use a customized database configuration; all that is needed is to create a database\_config.json file and place it at /etc/sonic/
 * We changed the database Docker ENTRYPOINT to docker-database-init.sh which is new added.
@@ -108,8 +108,8 @@ Detail steps as below:
 * [x] if no folder /host/old\_config/, copy some default xmls and etc. as usual
3.
**database service**
 * [x] **database docker start, entrypoint docker-database-init.sh**
-    * [x] **if database\_config.json is found at /ect/sonic/, that means there is customized database config, we copy this config file to /var/run/redis/sonic-db/, which is the running database config file location, all the applications will read databse information from this file**
-    * [x] **if database\_config.json is NOT found at /ect/sonic/, that means there is no customized database config, we copy this config file at /etc/default/ to /var/run/redis/sonic-db/, this is the default startuo config in the image itself.**
+    * [x] **if database\_config.json is found at /etc/sonic/, that means there is customized database config, we copy this config file to /var/run/redis/sonic-db/, which is the running database config file location, and all the applications will read database information from this file**
+    * [x] **if database\_config.json is NOT found at /etc/sonic/, that means there is no customized database config, we copy the config file at /etc/default/ to /var/run/redis/sonic-db/; this is the default startup config in the image itself.**
 * [x] **using supervisord.conf.j2 to generate supervisord.conf**
 * [x] **execute the previous entrypoint program /usr/bin/supervisord, then all the services will start based on the new supervisord.conf, which including starting how many redis instances**
 * [x] **check if redis instances are running or NOT via ping_pong_db_insts script**
@@ -258,10 +258,10 @@ public:
     static constexpr const char *DEFAULT_UNIXSOCKET = "/var/run/redis/redis.sock";
     /*
-     * Connect to Redis DB wither with a hostname:port or unix socket
+     * Connect to Redis DB either with a hostname:port or unix socket
      * Select the database index provided by "db"
      *
-     * Timeout - The time in milisecond until exception is been thrown. For
+     * Timeout - The time in milliseconds until an exception is thrown. For
      * infinite wait, set this value to 0
      */
     DBConnector(int dbId, const std::string &hostname, int port, unsigned int timeout);
@@ -336,7 +336,7 @@ Today the usage is to accept parameter in SonicV2Connector()->init() and then ca
 The new design is similar to what we did for C++. We introduce a new class SonicDBConfig which is used to read database\_config.json file and store the database configuration information.
-Then we modify the existing class SonicV2Connector, we use SonicDBConfig to get the database inforamtion in SonicV2Connector before we connect the redis instances.
+Then we modify the existing class SonicV2Connector: we use SonicDBConfig to get the database information in SonicV2Connector before we connect to the redis instances.
 
 interface.py
@@ -736,7 +736,7 @@ The scripts is used in shell, python, c and c++ system call, we need to change a
 We added a new sonic-db-cli which is written in python, the function is the same as redis-cli, the only difference is to accept db name as the first parameter instead of '-n x' for redis-cli.
-Form the db name, we can using exising python swsssdk library to look up the db information and use them. This new sonic-db-cli is in swsssdk as well and will be installed where ever swsssdk is installed.
+From the db name, we can use the existing python swsssdk library to look up the db information and use it. This new sonic-db-cli is in swsssdk as well and will be installed wherever swsssdk is installed.
 
 swsssdk/src/script/sonic-db-cli
 ```python
@@ -836,9 +836,9 @@ Then restore it when database instance is up.
 The redis instances & databases mapping may change between versions, that means before warmboot, we may have, for example, 3 redis instances and after warmboot, the number of redis instances may be, for example, 2 or 4. This makes the original 3 saved rdb files CANNOT restore the 2 or 4 redis instances directly. We need to do something to restore data based on the new redis instances & database mapping.
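The single-instance restore idea can be sketched with plain dicts, one `{db_id: {key: value}}` map per redis instance. A real implementation would move keys with redis DUMP/RESTORE rather than dicts, but the collision-freedom argument is the same: each SONiC database id lives in exactly one instance, so merging cannot clash.

```python
def merge_instances(instances):
    """Merge per-instance {db_id: {key: value}} maps into a single store.

    Safe because each database id (APPL_DB 0, ASIC_DB 1, ...) is assigned
    to exactly one redis instance, so db ids never collide across instances.
    Raises ValueError if a db id unexpectedly appears in two instances.
    """
    merged = {}
    for inst in instances:
        for db_id, data in inst.items():
            if db_id in merged:
                raise ValueError("db %d present in two instances" % db_id)
            merged[db_id] = dict(data)
    return merged
```

After such a merge into the spare instance, a single "redis-cli save" yields one rdb file covering every database, which is exactly what the warmboot flow below relies on.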
-Today we already assigned each database name with a unique number (APPL_DB 0, ASIC_DB 1, ...) and assign them into different redis instances in design. This makes it possible to migrate all data in all redis instances into one redis instance without any conflicts. Then we can handle this single redis instance the same as what we did today, since today we are only use single redis instance. So the poposed new idea is as below steps:
+Today we already assigned each database name with a unique number (APPL_DB 0, ASIC_DB 1, ...) and assign them into different redis instances in design. This makes it possible to migrate all data in all redis instances into one redis instance without any conflicts. Then we can handle this single redis instance the same as what we did today, since today we only use a single redis instance. So the proposed new idea consists of the steps below:
 
-1. When we normally start multiple redis instances, besides all the necessary and used redis isntances, we add one more unused/spare redis instance configration in database_config.json. This unused/spare redis instance is not accessed by any application and it is empty.
+1. When we normally start multiple redis instances, besides all the necessary and used redis instances, we add one more unused/spare redis instance configuration in database_config.json. This unused/spare redis instance is not accessed by any application and it is empty.
 2. In the shutdown stage of warmboot, we migrate the data on all used redis instances into the unused/spare/empty redis instance. In this way all the data are in one redis instance and we can issue "redis-cli save" now to generate a single rdb file containing all the data. After this point, everything is the same as what we did today for single redis instance, copying rdb file to WARM_DIR and so on.
 3. When loading new images during warmboot, we copy the single full data rdb file to each instance's rdb file path as what we did today.
Then all the instances have the full data after startup, each instance can access the data as usual based on the redis instances & databases mapping in database_config.json. The unused data could be deleted via "redis-cli flushdb".
 4. At this point, all the data are restored based on the new redis instance & database mapping. In this way, we don't need to find the delta between configurations and could handle all redis instances&database mapping cases.
@@ -846,8 +846,8 @@ Today we already assigned each database name with a unique number (APPL_DB 0, AS
 Below shows an example:
 - [x] Before warmboot, there are four redis instances and the mapping as shown.
-- [x] During warmboot, migrating all data into one redis intance and save rdb file.
-- [x] In new image, there are only three instances and after startup, each instances has full data. But ins0 only use DB0&DB1 based on configration in database_config.json, DB2&DB3 are never used and we can flushdb them.
+- [x] During warmboot, migrating all data into one redis instance and saving the rdb file.
+- [x] In new image, there are only three instances and after startup, each instance has full data. But ins0 only uses DB0&DB1 based on configuration in database_config.json; DB2&DB3 are never used and we can clear them with flushdb.
 
 ![center](./img/db_restore_new.png)
 
@@ -1000,7 +1000,7 @@ Now we see, the extra step for the new implementation is migrating all data into
 db6:keys=365,expires=0,avg_ttl=0
 ```
-So this method is good for warmboot database backup, we just need to add above python script to merge all data into one redis isntance and save data. The other changes are minor like copying files ....
+So this method is good for warmboot database backup; we just need to add the above python script to merge all data into one redis instance and save the data. The other changes are minor, like copying files.
## Platform VS
@@ -1010,7 +1010,7 @@ From the feedback in SONiC meeting, docker platform vs is suggested to have mult
 
 ## Other unit testing
 
-SONiC has many unit testing runing when building images or after submmitting PR. Those test cases are not running under database docker environment, hence for those local test cases, we add some static database\_config.josn under each tests directory or isntalled via library(swss and swsssdk). These database\_config.json files are used for testing only.
+SONiC has many unit tests running when building images or after submitting a PR. Those test cases are not running under the database docker environment, hence for those local test cases, we add some static database\_config.json files under each tests directory or install them via library (swss and swsssdk). These database\_config.json files are used for testing only.
 
 ## DUT Testing
diff --git a/doc/database/multi_namespace_db_instances.md b/doc/database/multi_namespace_db_instances.md
index 7d129b851f5..12b4b8a3221 100644
--- a/doc/database/multi_namespace_db_instances.md
+++ b/doc/database/multi_namespace_db_instances.md
@@ -11,7 +11,7 @@ In the Multi NPU devices, the services could be broadly classified into
 
 * Global services like database, snmp, pmon, telemetry running in the dockers running in the linux "host". We call it "global" namespace.
-* NPU specific services like database, swss, syncd, bgp, teamd, lldp etc which runs in a separate "namespace" created. The method used currently to seggregate services per NPU is "linux network namespace" and there is a one-to-one mapping between the number of NPUs and linux network namesapces created. We call it "NPU" namespace.
+* NPU specific services like database, swss, syncd, bgp, teamd, lldp etc which run in a separate "namespace" created. The method used currently to segregate services per NPU is "linux network namespace" and there is a one-to-one mapping between the number of NPUs and linux network namespaces created.
We call it "NPU" namespace. The database docker in the "global" namespace can be called the "global DB" service. The redis databases available here (decided by contents of database_config.json) would be APPL_DB, CONFIG_DB used to store the system wide attributes like AAA, syslog, ASIC to interface name mapping etc. @@ -31,7 +31,7 @@ The database service for a NPU linux namespace {NS} will use "/var/run/redis_{NS Following are the major design changes -* A new file **database_global.json** is introduced. It will contain the details of all the namespaces and the corresponsing database_config.json files. This file would be created by the "globalDB" service in the directory "/var/run/redis/sonic-db/". In the below example, we consider a SONIC device with 3 NPUS's and hence have 3 namespaces referred as "asic0", "asic1", "asic2". The first entry refers to the database_config.json file used by database docker running in linux host. +* A new file **database_global.json** is introduced. It will contain the details of all the namespaces and the corresponding database_config.json files. This file would be created by the "globalDB" service in the directory "/var/run/redis/sonic-db/". In the below example, we consider a SONIC device with 3 NPUS's and hence have 3 namespaces referred as "asic0", "asic1", "asic2". The first entry refers to the database_config.json file used by database docker running in linux host. ```json { @@ -131,7 +131,7 @@ Following are the major design changes the linux host namespace, the NS_REF_CNT will be equal to the number of namespaces in the device. Currently we have a NPU:namespace mapping of 1:1, hence we pass the NS_REF_CNT to be the number of NPU's. - Additional variables to be introduced in future to make this more flexible like creating more redis INSTANCES, assosiating DATABASES to different redis instances etc. 
+ Additional variables to be introduced in future to make this more flexible like creating more redis INSTANCES, associating DATABASES to different redis instances etc. **database_global.json** @@ -250,10 +250,10 @@ class SonicDBConfig(object): _sonic_db_config = {} """This is the database_global.json parse and load API. This file has the namespace name and - the corresponsing database_config.json file. The global file is significant for the + the corresponding database_config.json file. The global file is significant for the applications running in the linux host namespace, like eg: config/show cli, snmp etc which needs to connect to databases running in other namesacpes. If the "namespace" attribute is not - specified for an "include" attribute, it referes to the linux host namespace. + specified for an "include" attribute, it refers to the linux host namespace. """ @staticmethod def load_sonic_global_db_config(global_db_file_path=SONIC_DB_GLOBAL_CONFIG_FILE): @@ -306,7 +306,7 @@ class SonicDBConfig(object): logger.warning(msg) sonic_db_file_path = os.path.join(os.path.dirname(os.path.abspath(__file__)), 'config', 'database_config.json') with open(sonic_db_file_path, "r") as read_file: - # The database_config.json is loaded into the '' index, which referes to the local namespace. + # The database_config.json is loaded into the '' index, which refers to the local namespace. SonicDBConfig._sonic_db_config[''] = json.load(read_file) except (OSError, IOError): msg = "Could not open sonic database config file '{}'".format(sonic_db_file_path) @@ -353,8 +353,8 @@ class SonicV2Connector(DBInterface): self.namespace = namespace """If the user don't give the namespace as input, it takes the default value of '' - '' (empty string) referes to the local namespace where this class is used. - (It could be a network namespace or linux host namesapce) + '' (empty string) refers to the local namespace where this class is used. 
+        (It could be a network namespace or linux host namespace)
         """
         if not isinstance(namespace, str):
             msg = "{} is not a valid namespace name".format(namespace)
diff --git a/doc/debug_framework_design_spec.md b/doc/debug_framework_design_spec.md
index 7567e26d997..76fa0a2eb9b 100644
--- a/doc/debug_framework_design_spec.md
+++ b/doc/debug_framework_design_spec.md
@@ -207,7 +207,7 @@ DebugDumpOrch show commands will act as triggers for OrchAgent.
 
 Syntax: `show debug ... `
 
-Definition of command and required options are left to the component module owners. Sample CLI section descripes few examples on how the "show debug" CLI command can be used.
+Definition of command and required options are left to the component module owners. The Sample CLI section describes a few examples of how the "show debug" CLI command can be used.
 
 ##### 3.1.1.1.3 Sample CLI
 *RouteOrch:*
@@ -302,7 +302,7 @@ Debug framework will initialize with below default parameters and shall be consi
 
 ### 3.1.2 Assert Framework
 #### 3.1.2.1 Overview
-Asserts are added in the program execution sequence to confirm that the data/state at a certain point is valid/true. During developement, if the programming sequence fails in an assert condition then the program execution is stopped by crash/exception. In production code, asserts are normally removed. This framework enhances/extendes the assert to provide more debug details when an assert fails.
+Asserts are added in the program execution sequence to confirm that the data/state at a certain point is valid/true. During development, if the programming sequence fails in an assert condition then the program execution is stopped by crash/exception. In production code, asserts are normally removed. This framework enhances/extends the assert to provide more debug details when an assert fails.
Classify assert failure conditions based on following types, assert() will have type and the module as additional arguments - DUMP: Invokes the debug framework registered callback API corresponding to the module @@ -311,7 +311,7 @@ Classify assert failure conditions based on following types, assert() will have - ABORT: Stop/throw exception -#### 3.1.2.2 PsuedoCode: +#### 3.1.2.2 PseudoCode: ``` static void custom_assert(bool exp, const char*func, const unsigned line); diff --git a/doc/dhcp_server/port_based_dhcp_server_high_level_design.md b/doc/dhcp_server/port_based_dhcp_server_high_level_design.md index 9f5a1c5ef97..78fae4678cf 100644 --- a/doc/dhcp_server/port_based_dhcp_server_high_level_design.md +++ b/doc/dhcp_server/port_based_dhcp_server_high_level_design.md @@ -119,7 +119,7 @@ For broadcast packet (discover, request) sent by client, obviously it would be r
-Belows are sample configurations for dhcrelay and kea-dhcp-server:
+Below are sample configurations for dhcrelay and kea-dhcp-server:
 - dhcprelay:
 ```CMD
@@ -257,7 +257,7 @@ Have to be aware of is that below options are not supported to customize, becaus
 Currently support binary, boolean, string, ipv4-address, uint8, uint16, uint32.
 
 ## DHCP Relay Daemon
-For scenario of dhcp_server feature is enabled, we need a daemon process inside dhcp_relay container to manage dhcrelay processes. dhcprelayd would subcribe VLAN/VLAN_MEMBER/DHCP_SERVER_IPV4* table in config_db, and when dhcp_relay container restart or related config changed, dhcprelayd will kill/start/restart dhcrelay process.
+For the scenario where the dhcp_server feature is enabled, we need a daemon process inside the dhcp_relay container to manage dhcrelay processes. dhcprelayd would subscribe to the VLAN/VLAN_MEMBER/DHCP_SERVER_IPV4* tables in config_db, and when the dhcp_relay container restarts or related config changes, dhcprelayd will kill/start/restart dhcrelay processes.
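The kill/start/restart decision dhcprelayd has to make can be sketched as a pure function over the old and new per-VLAN relay config. The table shape here (`{vlan_name: relay_config}` snapshots) is a simplifying assumption for illustration, not the real CONFIG_DB schema.

```python
def plan_dhcrelay_actions(old_cfg, new_cfg):
    """Map each VLAN to the action needed on its dhcrelay process.

    old_cfg/new_cfg: {vlan_name: relay_config} snapshots taken before and
    after a change to the VLAN/VLAN_MEMBER/DHCP_SERVER_IPV4* tables.
    """
    actions = {}
    for vlan in old_cfg.keys() | new_cfg.keys():
        if vlan not in new_cfg:
            actions[vlan] = "kill"      # relay no longer configured
        elif vlan not in old_cfg:
            actions[vlan] = "start"     # newly configured relay
        elif old_cfg[vlan] != new_cfg[vlan]:
            actions[vlan] = "restart"   # config changed in place
    return actions
```

VLANs whose config is unchanged get no entry, so their dhcrelay processes are left untouched.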
@@ -787,7 +787,7 @@ This command is used to unbind dhcp option.
 config dhcp_server ipv4 option unbind (--all | )
 ```
-- Exampe
+- Example
 ```
 config dhcp_server ipv4 option unbind Vlan1000 --all
diff --git a/doc/drop_counters/drop_counters_HLD.md b/doc/drop_counters/drop_counters_HLD.md
index ea8c930102e..c3ff3d11575 100644
--- a/doc/drop_counters/drop_counters_HLD.md
+++ b/doc/drop_counters/drop_counters_HLD.md
@@ -9,7 +9,7 @@
 * [Revision](#revision)
 * [About this Manual](#about-this-manual)
 * [Scope](#scope)
-* [Defintions/Abbreviation](#definitionsabbreviation)
+* [Definitions/Abbreviation](#definitionsabbreviation)
 * [1 Overview](#1-overview)
   - [1.1 Use Cases](#11-use-cases)
     - [1.1.1 A flexible "drop filter"](#111-a-flexible-"drop-filter")
diff --git a/doc/dualtor/active_active_hld.md b/doc/dualtor/active_active_hld.md
index 1795dcf4579..43b068dd91e 100644
--- a/doc/dualtor/active_active_hld.md
+++ b/doc/dualtor/active_active_hld.md
@@ -18,7 +18,7 @@ This document provides the high level design of SONiC dual toR solution, support
 ## Content
 [1 Cluster Topology](#active_active_hld.md#1-cluster-topology)
-[2 Requrement Overview](#2-requrement-overview)
+[2 Requirement Overview](#2-requirement-overview)
 - [2.1 Server Requirements](#21-server-requirements)
 - [2.2 SONiC Requirements](#22-sonic-requirements)
@@ -38,7 +38,7 @@ This document provides the high level design of SONiC dual toR solution, support
   - [3.3.3 Forwarding State](#333-forwarding-state)
   - [3.3.4 Acitve-Active State Machine](#334-acitve-active-state-machine)
   - [3.3.5 Default route to T1](#335-default-route-to-t1)
-  - [3.3.6 Incremental Featrues](#336-incremental-featrues)
+  - [3.3.6 Incremental Features](#336-incremental-features)
 - [3.4 Orchagent](#34-orchagent)
   - [3.4.1 IPinIP tunnel](#341-ipinip-tunnel)
   - [3.4.2 Flow Diagram and Orch Components](#342-flow-diagram-and-orch-components)
@@ -72,7 +72,7 @@ In this design:
 
 ![image info](./image/cluster_topology.png)
 
-## 2 Requrement Overview
+## 2 
Requirement Overview
 ### 2.1 Server Requirements
 In our cluster setup, as smart y-cable is replaced, some complexity shall be transferred to server NIC.
@@ -133,7 +133,7 @@ Note that, this complexity can be handled by active-active smart cables, or any 
 1. Introduce active-active mode into MUX state machine.
 1. Probe to determine if link is healthy or not.
 1. Signal NIC if ToR is switching active or standby.
-1. Rescue when peer ToR failure occures.
+1. Rescue when peer ToR failure occurs.
 1. Unblock traffic when cable control channel is unreachable.
 
 ## 3 SONiC ToR Controlled Solution
@@ -330,7 +330,7 @@ Linkmgrd will provide the determination of a ToR / link's readiness for use.
 ![grpc_failure](./image/gRPC_failure.png)
 
 #### 3.3.5 Default route to T1
- If default route to T1 is missing, dual ToR system can suffer from northbound packet loss, hence linkmgrd also monitors defaul route state. If default route is missing, linkmgrd will stop sending ICMP probing request and fake an unhealthy status. This functionality can be disabled as well, the details is included in [default_route](https://github.com/sonic-net/sonic-linkmgrd/blob/master/doc/default_route.md).
+ If default route to T1 is missing, dual ToR system can suffer from northbound packet loss, hence linkmgrd also monitors default route state. If default route is missing, linkmgrd will stop sending ICMP probing request and fake an unhealthy status. This functionality can be disabled as well; the details are included in [default_route](https://github.com/sonic-net/sonic-linkmgrd/blob/master/doc/default_route.md).
 
 To summarize the state transition decision we talk about, and the corresponding gRPC action to take, we have this decision table below:
@@ -400,17 +400,17 @@ Linkmgrd will provide the determination of a ToR / link's readiness for use.
-#### 3.3.6 Incremental Featrues
+#### 3.3.6 Incremental Features
 
-* Link Prober Packet Loss Statics
-  Link prober will by default send heartbeat packet every 100 ms, the packet loss statics can be a good measurement of system healthiness. An incremental feature is to collect the packet loss counts, start time and end time. The collected data is stored and updated in state db. User can check and reset through CLI.
+* Link Prober Packet Loss Statistics
+  Link prober will by default send a heartbeat packet every 100 ms; the packet loss statistics can be a good measurement of system healthiness. An incremental feature is to collect the packet loss counts, start time and end time. The collected data is stored and updated in state db. User can check and reset through CLI.
 
-* Supoort for Detachment
+* Support for Detachment
   User can config linkmgrd to a certain mode, so it won't switch to active / standby based on health indicators. User can also config linkmgrd to a mode, so it won't modify peer's forwarding state. This support will be useful for maintenance, upgrade and testing scenarios.
 
 ### 3.4 Orchagent
 #### 3.4.1 IPinIP tunnel
-Orchagent will create tunnel at initialization and add / remove routes to forward traffic to peer ToR via this tunnel when linkmgrd switchs state to standby / active.
+Orchagent will create tunnel at initialization and add / remove routes to forward traffic to peer ToR via this tunnel when linkmgrd switches state to standby / active.
 Check below for an example of config DB entry and tunnel utilization when LT0's link is having issue.
 
 ![tunnel](./image/tunnel.png)
@@ -453,7 +453,7 @@ This optimization maintains the same traffic forwarding behavior while significa
 
 ### 3.5 Transceiver Daemon
 #### 3.5.1 Cable Control through gRPC
- In active-active design, we will use gRPC to do cable control and signal NIC if ToRs is up active. SoC will run a gRPC server. 
Linkmgrd will determine server side forwarding state based on link prober status and link state. Then linkmgrd can invoke transceiver daemon to update NIC wether ToRs are active or not through gRPC calls.
+ In active-active design, we will use gRPC to do cable control and signal the NIC whether the ToR is active. SoC will run a gRPC server. Linkmgrd will determine server side forwarding state based on link prober status and link state. Then linkmgrd can invoke transceiver daemon to update NIC whether ToRs are active or not through gRPC calls.
 
 Current defined gRPC services between SoC and ToRs related with linkmgrd cable controlling:
 * DualToRActive
@@ -497,7 +497,7 @@ The following UML diagram shows this change when Linkmgrd state moves to standby
 
 #### 3.8.1 Advertise updated routes to T1
 Current failover strategy can smoothly handle the link failure cases, but if one of the ToRs crashes, and if T1 still sends traffic to the crashed ToR, we will see packet loss.
-A further improvement in rescuing scenario, is when detecting peer's unhealthy status, local ToR advertises specific routes (i.e. longer prefix), so that traffic from T1 does't go to crashed ToR as all.
+A further improvement in the rescuing scenario is that, when detecting the peer's unhealthy status, the local ToR advertises specific routes (i.e. a longer prefix), so that traffic from T1 doesn't go to the crashed ToR at all.
 
 #### 3.8.2 Server Servicing & ToR Upgrade
 For server graceful restart, We already have gRPC service defined in [3.5.1](#351-cable-control-through-grpc). An indicator of ongoing server servicing should be defined based on that notification, so ToR can avoid upgrades in the meantime. Vice versa, we can also define gRPC APIs to notify server when ToR upgrade is ongoing.
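The orchagent behavior described in 3.4.1 — detour traffic through the IPinIP tunnel only while in standby — reduces to a small state-change rule. The sketch below is illustrative only, not muxorch code; the action names are placeholders.

```python
def tunnel_route_action(prev_state, new_state):
    """Action on the peer-ToR IPinIP tunnel route when linkmgrd switches
    mux state. In standby the ToR must forward server-bound traffic to the
    peer via the tunnel; in active the direct route is used instead.
    Returns None when no state change occurred."""
    if prev_state == new_state:
        return None
    return "add_tunnel_route" if new_state == "standby" else "remove_tunnel_route"
```

So a switch to standby installs the tunnel route and a switch back to active removes it, keeping route churn limited to actual state transitions.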
diff --git a/doc/dualtor/dualtor_active_standby_hld.md b/doc/dualtor/dualtor_active_standby_hld.md index 5b2ce1c5d17..f767899ecb3 100644 --- a/doc/dualtor/dualtor_active_standby_hld.md +++ b/doc/dualtor/dualtor_active_standby_hld.md @@ -91,7 +91,7 @@ Able to switch to a healthy link/ToR when there is a link failure. | | | timeout | treat loss of heartbeat if link probe not received for x intervals. | | | | suspend_timer | heartbeats will be suspended for duration of linkprober_suspend_timer when transition from LinkManager transition from Active to Unknown state. | | | | positive_signal_count | event count to confirm a positive state transition, i.e. unknown to active. | -| | | negative_signal_count | event count to confirm a negative state transtion, i.e. active to unknown. | +| | | negative_signal_count | event count to confirm a negative state transition, i.e. active to unknown. | | Localhost | MUX_DRIVER | | | | | | i2c_retry_count | number of I2C retries before driver announce MUX entering failure state | | MUX_CABLE | \ | | | @@ -135,7 +135,7 @@ Able to switch to a healthy link/ToR when there is a link failure. | HW_MUX_CABLE_TABLE | \ | | | | | | state | "active\|standby\|unknown"; Written by ycabled. "unknown" when MUX fails to respond to I2C commands for i2c_retry_count. | | MUX_LINKMGR_TABLE | \ | | | -| | | state | "healthy, unhealthy, uninitialized"; Written by linkmgrd to reflect current combined mux state and link prober state. "unintialized" indicates that SM for this port did not complete initialization and is considered unhealthy for auto failover. | +| | | state | "healthy, unhealthy, uninitialized"; Written by linkmgrd to reflect current combined mux state and link prober state. "uninitialized" indicates that SM for this port did not complete initialization and is considered unhealthy for auto failover. | | MUX_METRICS_TABLE | \ | | | | | | \_switch_\_start | “yyyy-mmm-dd hh:mm:ss.uuuuuu”; time when \ starts switch operation to state \. 
\ is one of ycabled, orchagent, or linkmgrd and \ is active or standby. | | | | \_switch_\_end | “yyyy-mmm-dd hh:mm:ss.uuuuuu”; time when \ completes switch operation to state \. \ and \ are as defined above. | diff --git a/doc/dualtor/multiple_nexthop_route_hld.md b/doc/dualtor/multiple_nexthop_route_hld.md index 4c5871cccee..65deaab58a9 100644 --- a/doc/dualtor/multiple_nexthop_route_hld.md +++ b/doc/dualtor/multiple_nexthop_route_hld.md @@ -22,7 +22,7 @@ We will be adding a function `updateRoute()` to muxorch which will handle refres If `updateRoute()` finds 1 nexthop corresponding to the given route, it will be a no-op and the current behavior will be maintained. -If `updateRoute()` finds more than 1 nexthop corresponding to the given route, then it will set the hardware route according to the folowing logic using `sai_route_api`: +If `updateRoute()` finds more than 1 nexthop corresponding to the given route, then it will set the hardware route according to the following logic using `sai_route_api`: 1. If a nexthop neighbor is in Standby: do nothing 2. If a nexthop neighbor is in Active: Set the first active neighbor as the sole nexthop and return. @@ -133,7 +133,7 @@ Several assumptions are made as a part of this design. This means that there are - As part of this design, we are assuming that within a nexthop group, nexthop neighbors are either ALL "Mux" nexthops, or None of the neighbors are "Mux" nexthops. Due to this assumption, we will not support mixed "Mux" and "non-Mux" nexthop neighbors within a nexthop group. -- Since only one nexthop will be programmed to the ASIC, all ECMP neighbors with a mux will fall back to 1 active nexthop neigbor or 1 tunnel route. +- Since only one nexthop will be programmed to the ASIC, all ECMP neighbors with a mux will fall back to 1 active nexthop neighbor or 1 tunnel route. 
## Testing diff --git a/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md b/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md index 4cfa0041166..32934ede37e 100644 --- a/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md +++ b/doc/dynamic-port-breakout/sonic-dynamic-port-breakout-HLD.md @@ -137,7 +137,7 @@ The new naming converntion like [SONiC port naming] (https://github.com/yxieca/S Also there was considerations of grouping port together for different purposes and also support or mix with FibreChannel ports with different naming convention, this will NOT be part of this design. -For the naming changes, since the port naming will be in `platform.json` etc common files (check below section) and the parser function is common, the design will allow you to easiy improve this later. +For the naming changes, since the port naming will be in `platform.json` etc common files (check below section) and the parser function is common, the design will allow you to easily improve this later. ## Platform capability design A capability file for a platform with port breakout capabilities is provided. It defines the initial configurations like lanes, speed, alias etc. This file will be used for CLI later to pick the parent port and breakout mode. It can be used for speed checks based on the current mode. It (in conjunction with `hwsku.json` talked later) will also replace the functionality of the current existing port_config.ini. @@ -247,7 +247,7 @@ After dynamic port breakout support, we won’t need different HWSKUs for the sa The `default_brkout_mode` mode should be one of the modes in `brkout_mode` from the same port in `platform.json`. -The file will work in conjuction with the `platform.json` at platform level, and it will be used for port breakout during the initialization phase. +The file will work in conjunction with the `platform.json` at platform level, and it will be used for port breakout during the initialization phase. 
Above `platform.json` and `hwsku.json` files will deprecate the old port_config.ini file in the current SONiC design. @@ -444,7 +444,7 @@ High level configuration flow is as below: Interfaces in the figure: 1. CLI will utilize `platform.json` and configDB to find out what ports to be deleted and then generate the individual port configurations into configDB. When port-breakout was changed successfully, CLI will update the configDB for `BREAKOUT_CFG` table. -2. CLI will call the config management library to load configDB data into Yang instance data after generic translation, then during the port delete, it will find all the dependencies and optionally remove them automatically. It then translates the Yang instance data to configDB data. This interface is also used for config validate like syntax checks and dependency checks, whenever we are pushing data to configDB, e,g, during the port adding process. Note: CLI is also leaveraged to verfiy the port deletion completeness before adding the new ports. +2. CLI will call the config management library to load configDB data into Yang instance data after generic translation, then during the port delete, it will find all the dependencies and optionally remove them automatically. It then translates the Yang instance data to configDB data. This interface is also used for config validation like syntax checks and dependency checks, whenever we are pushing data to configDB, e.g., during the port adding process. Note: CLI is also leveraged to verify the port deletion completeness before adding the new ports. 3. CLI will use existing RedisDB APIs or utilities to read/write data to configDB. 4. This is the RedisDB interface. @@ -1187,7 +1187,7 @@ remove_next_hop remove_neighbor_entry ``` -All thee attribute that coud be changed in orchagent should be able to be brought back to default state. e,g, set the attribute to null. +All the attributes that could be changed in orchagent should be able to be brought back to default state.
e.g., set the attribute to null. # Warm reboot support Syncd changes are required as mentioned above. The PR need to be merged and tested. @@ -1202,7 +1202,7 @@ As part of enabling CMIS FSM with port breakout, found out that port breakout fe A NxS breakout cable inserted implies following - A physical port is broken down into N subports (logical ports) - subports are numbered as 8/N i.e. - - For 4x100G breakout optical module inserted in physcial Ethernet port 1 (etp1), implies: Ethernet8, Ethernet10, Ethernet12, Ethernet14 + - For 4x100G breakout optical module inserted in physical Ethernet port 1 (etp1), implies: Ethernet8, Ethernet10, Ethernet12, Ethernet14 - Note: This is done in this manner to keep such assignments uniform across various breakout modes viz.1x, 2x, 4x, 8x - Speed of each subport is S Gpbs - Unique subport# is assigned to each of the N sub-port starting with subport# 1 (and sequentially incrementing with each sub-port) @@ -1215,8 +1215,8 @@ A NxS breakout cable inserted implies following - Configure a unique subport# for each broken-down (logical) port in platform's port_config.ini - - subport# to start with 1 (and increment sequentailly for each logical port) under the same physcial port - - subport# sequence may repeat for logical ports under another physcial port + - subport# to start with 1 (and increment sequentially for each logical port) under the same physical port + - subport# sequence may repeat for logical ports under another physical port - subport# as 0 (on a port) implies physical port itself (i.e. no port breakout on it) - These subport#s are then parsed and updated in PORT_TABLE of CONFIG redisDB - There would be a unique PORT_TABLE for each logical port @@ -1239,12 +1239,12 @@ A NxS breakout cable inserted implies following - subport 1: Lanes 1,2,3,4 - subport 2: Lanes 5,6,7,8 -- Next, xcvrd to initiate CMIS FSM (state machine) initilization for each logical port.
+- Next, xcvrd to initiate CMIS FSM (state machine) initialization for each logical port. Prior to this, xcvrd to perform following steps: - xcvrd to determine Active Lanes (per subport) from the App Advertisement Table of CMIS Spec. - xcvrd to read 'speed', 'subport' and lanes information (of a logical/sub-port) from the PORT_TABLE of CONFIG DB to perform look-up in appl_dict - xcvrd to check Table 6.1 (of CMIS v5.2) to find the desired application (for the inserted optical module) via get_cmis_application_desired() subroutine - - get_application_advertisement() in xcvrd codebase (cmis.py), which eventually formualtes appl_dict + - get_application_advertisement() in xcvrd codebase (cmis.py), which eventually formulates appl_dict - Use the following criteria to determine the right 'key' in appl_dict dictionary for App\ - Use 'speed' and compare it to transeiver's EEPROM HostInterfaceID for App\ (First Byte of Table 6.1) - Use '# of lanes' (i.e. host_lane_count per subport) as determined above and compare it to HostLaneCount for App\ (Third Byte of Table 6.1) diff --git a/doc/ecmp/fine_grained_next_hop_hld.md b/doc/ecmp/fine_grained_next_hop_hld.md index 389fe8b40b8..43b168c1675 100644 --- a/doc/ecmp/fine_grained_next_hop_hld.md +++ b/doc/ecmp/fine_grained_next_hop_hld.md @@ -55,9 +55,9 @@ This document describes the high level design of a Fine Grained ECMP feature as ![](../../images/ecmp/use_case.png) Firewall or other applications running on loadbalanced VMs which maintain state of flows running through them, such that: -- There is shared state amongst some set of firewalls so that flows can be recovered if a flow transistions from 1 firewall to another. In the +- There is shared state amongst some set of firewalls so that flows can be recovered if a flow transitions from 1 firewall to another. In the prefix-based match mode, there is only 1 bank of firewalls and either all the firewalls share the same state or none of them do. 
In route/nexthop - mode, the number of banks can be configured and all the firewalls in the same firewall set(bank) will share state. Flow transistions can occur + mode, the number of banks can be configured and all the firewalls in the same firewall set(bank) will share state. Flow transitions can occur when next-hops are added/withdrawn. In this example for the route/nexthop match-mode, Firewall 1,2,3 form a firewall set(bank) - Flow recovery is expensive so we should limit the flow redistributions which occur during next-hop addition and removal - Given that not all firewalls share state, there is a need to redistribute flows only amongst the firewalls which share state @@ -310,7 +310,7 @@ Following orchagents shall be modified. Flow diagrams are captured in a later se - fgnhgorch ### routeorch - This is the swss orchestrator responsible for pushing routes down to the ASIC. It creates ECMP groups in the ASIC for cases where there are multiple next-hops. It also adds/removes next-hop members as neighbor availability changes(link up and down scnearios). It will evoke fgnhgorch for all routes which desire special ecmp behavior. + This is the swss orchestrator responsible for pushing routes down to the ASIC. It creates ECMP groups in the ASIC for cases where there are multiple next-hops. It also adds/removes next-hop members as neighbor availability changes(link up and down scenarios). It will evoke fgnhgorch for all routes which desire special ecmp behavior. ### fgnhgorch This is the swss orchestrator which receives FG_NHG entries and identifies the exact way in which the hash buckets need to be created and assigned at the time of BGP route modifications. For BGP route modifications/next-hop changes, fgnhgorch gets evoked by routeorch. It creates ecmp groups with the new SAI components in Table 3 and will be the orchestrator responsible for achieving the use cases highlighted above by modifying hash buckets in a special manner. 
Fgnhgorch will also be an observer for SUBJECT_TYPE_PORT_OPER_STATE_CHANGE from portsorch, this will allow operational state changes for links to be reflected in the ASIC per fine grained behavior. @@ -345,7 +345,7 @@ The below table represents main SAI attributes which shall be used for Fine Grai - A key idea in achieving consistent ecmp and limiting redistributions to a bank(group) is the creation of many hash buckets(SAI_OBJECT_TYPE_NEXT_HOP_GROUP_MEMBER) associated with an ecmp group and having a next-hop repeated multiple times within it. - Now if a next-hop were to go down we would only change the hash buckets which are affected by the next-hop down event. This allows us to ensure that all flows are not affected by a next-hop change, thereby achieving consistent hashing - Further, in the route/nexthop match modes, by pushing configuration with next-hop bank membership, we can ensure that we only refill the affected hash buckets with those next-hops within the same bank. Thus achieving consistent hashing within a bank itself and meeting the requirement/use case above. -- A distiction is made between Kernel routes and hardware routes for fine grained ECMP. The kernel route contains the prefix along with standard next-hops as learnt via BGP or any other means. Fine Grained ECMP takes that standard route(as pushed via APP DB + routeorch) and then creates a fine grained ECMP group by expanding it into the hash bucket membership. Further the kernel route and hw route are not equivalent due to the special redistribution behavior with respect to the bank defintion. Special logic is also present in route/nexthop modes to ensure that any next-hops which don't match the static FG_NHG next-hop set for a prefix will cause the next-hop to be ignored to maintain consistency with the desired hw route and hashing state defined in FG_NHG. FG_NHG drives the final state of next-hop groups in the ASIC given a user programs the config_db entry for it. 
+- A distinction is made between Kernel routes and hardware routes for fine grained ECMP. The kernel route contains the prefix along with standard next-hops as learnt via BGP or any other means. Fine Grained ECMP takes that standard route(as pushed via APP DB + routeorch) and then creates a fine grained ECMP group by expanding it into the hash bucket membership. Further the kernel route and hw route are not equivalent due to the special redistribution behavior with respect to the bank definition. Special logic is also present in route/nexthop modes to ensure that any next-hops which don't match the static FG_NHG next-hop set for a prefix will cause the next-hop to be ignored to maintain consistency with the desired hw route and hashing state defined in FG_NHG. FG_NHG drives the final state of next-hop groups in the ASIC given a user programs the config_db entry for it. - Given that fgnhgorch can ignore next-hops in route addition in order to maintain consistency with FG_NHG, special syslog error messages will be displayed whenever fgnhgorch skips propagation of a next-hop to the ASIC. - A guideline for the hash bucket size is to define a bucket size which will allow equal distribution of traffic regardless of the number of next-hops which are active. For example with 2 Firewall sets, each set containing 3 firewall members: each set can have equal redistribution by finding the lowest common multiple of 3 next-hops which is 3x2x1(this is equivalent to us saying that if there were 3 or 2 or 1 next-hop active, we could distribute the traffic equally amongst the next-hops). With 2 such sets we get a total of 3x2x1 + 3x2x1 = 12 hash buckets. - fgnhgorch is an observer for SUBJECT_TYPE_PORT_OPER_STATE_CHANGE events, these events are used in conjunction with the IP to interface mapping(INTERFACE attribute of the FG NHG member table), to trigger next-hop withdrawal/addition depending on which interface's operational state transitioned to down/up. 
The next-hop withdrawal/addition is performed per consistent and layered hashing rules. The INTERFACE attribute is optional, so this functionality is activated based on user configuration. @@ -518,15 +518,15 @@ A new test called test_fgnhg.py will be created to test FG_NHG configurations. T Test details: ### Route-based/Nexthop-based Match Mode - Create FG_NHG config_db entry with 2 banks, 3 members per bank -- Create 6 interfaces with IPs, and program an APP_DB route with IP prefix + 6 next-hops matching above config_db entry: check if hash buckets are created as expected adhereing to the bank defintions +- Create 6 interfaces with IPs, and program an APP_DB route with IP prefix + 6 next-hops matching above config_db entry: check if hash buckets are created as expected adhering to the bank definitions - APP_DB route modified to reduce some number of next-hops: check if ASIC_DB hash bucket members show that the swss code maintains layered and consistent hashing - APP_DB route modified to remove all next-hops in bank0: check if ASIC_DB hash bucket members show that members of bank1 take the place bank0 members - APP_DB route modified to add 1st next-hop to bank0: check if ASIC_DB hash bucket members show that the added next-hop member takes up all the hash buckets assigened to the bank0 - Test both IPv4 and IPv6 above -- Disable a link from the link mapping created in FG_NHG_MEMBER and validate that hash buckets were redistributed in the same bank and occured in a consistent fashion -- Test dynamic changes to the config_db bank + member defintion +- Disable a link from the link mapping created in FG_NHG_MEMBER and validate that hash buckets were redistributed in the same bank and occurred in a consistent fashion +- Test dynamic changes to the config_db bank + member definition - Change ARP(NEIGH)/interface reachability and validate that ASIC_DB hash bucket members are as expected(ie: maintaining layered and consistent hashing) -- Test warm reboot and ensure that
Fine Grained ECMP entries in the ASIC are identical post warm reboot. Ensure that nexthop modifications post warm reboot yeild expected changes in hash buckets. +- Test warm reboot and ensure that Fine Grained ECMP entries in the ASIC are identical post warm reboot. Ensure that nexthop modifications post warm reboot yield expected changes in hash buckets. - Run the above set of tests for both nexthop-based and route-based match_modes. Additionally, for nexthop-based matchmode, validate changes in asic objects for route transitions from fine grained ecmp to regular ecmp and vice-versa. The route transition can occur because a route points to one set of nexthops which are fine grained, and the route may change later to point to nexthops which are non-fine grained and vice-versa. We validate these cases and the resulting ASIC DB objects. ### Prefix-based Match Mode @@ -534,7 +534,7 @@ Test details: - Create 6 interfaces with IPs, and program an APP_DB route with IP prefix + 3 next-hops: check if hash buckets are created as expected. - APP_DB route modified to remove a next-hop: check if ASIC_DB hash bucket members show that the swss code maintains layered and consistent hashing. - APP_DB route modified to add two next-hops and remove a next-hop: check if ASIC_DB hash bucket members show that the swss code maintains layered and consistent hashing. -- Test warm reboot and ensure that Fine Grained ECMP entries in the ASIC are identical post warm reboot. Ensure that nexthop modifications post warm reboot yeild expected changes in hash buckets. +- Test warm reboot and ensure that Fine Grained ECMP entries in the ASIC are identical post warm reboot. Ensure that nexthop modifications post warm reboot yield expected changes in hash buckets. - Create another FG_NHG config_db entry with a different prefix, bucket_size and max_next_hops_value. 
- Add an APP_DB route for the 2nd prefix that shares a next-hop with the 1st prefix: ensure the hash buckets for hardware nexthop groups corresponding to both prefixes are created correctly. - Bring down and then bring up a next-hop common to both prefixes: Check that ASIC_DB hash buckets are updated correctly and only the minimum number of buckets changed in both groups. @@ -550,11 +550,11 @@ Test details: - Create a route entry with 8 IPs as the next-hop, and an IP prefix as defined in FG_NHG, deploy it to the DUT - Pytest will now evoke the fine grained ECMP PTF test to send 1000 unique flows from the T1 interface destined to the unique IP prefix - Track which link receives which flow and store the mapping of flow to link -- Change the DUT route entry to reduce 1 next-hop, validate that flows were redistributed in the same bank and occured in a consistent fashion -- Change the DUT route entry to add 1 next-hop, validate that flows were redistributed in the same bank and occured in a consistent fashion +- Change the DUT route entry to reduce 1 next-hop, validate that flows were redistributed in the same bank and occurred in a consistent fashion +- Change the DUT route entry to add 1 next-hop, validate that flows were redistributed in the same bank and occurred in a consistent fashion - Change the DUT route entry to have all next-hops in a bank0 as down, make sure that the traffic now flows to links in bank1 only - Change the DUT route entry to add 1st next-hop in a previously down bank0, now some of the flows should migrate to the newly added next-hop -- Disable a link from the link mapping created in FG_NHG_MEMBER and validate that flows were redistributed in the same bank and occured in a consistent fashion +- Disable a link from the link mapping created in FG_NHG_MEMBER and validate that flows were redistributed in the same bank and occurred in a consistent fashion - Validate that in all cases the flow distribution per next-hop is roughly equal - Test both 
IPv4 and IPv6 above - The above test is configured via config_db entries directly, a further test mode to configure Fine Grained ECMP via minigraph will be present and tested diff --git a/doc/ecmp/inner_packet_hashing_test_plan.md b/doc/ecmp/inner_packet_hashing_test_plan.md index eef3dfb94b7..ebdd1a71fb5 100644 --- a/doc/ecmp/inner_packet_hashing_test_plan.md +++ b/doc/ecmp/inner_packet_hashing_test_plan.md @@ -36,7 +36,7 @@ The purpose of this test plan is to describe inner packet hashing tests for ECMP ## Test information ### Supported topology -The test will be supported on the T0 toplogy(add and verify others), it may be enhanced in the future for other topologies. +The test will be supported on the T0 topology(add and verify others), it may be enhanced in the future for other topologies. ### Test configuration The inner hashing configuration to the DUT done by dynamic Policy Base Hashing feature. @@ -117,7 +117,7 @@ pbh_table vxlan_ipv6_ipv6 1 ether_type: 0x86dd inner_hash S ### High level test details 1. Send packets to a destination prefix which is pointing to multiple ecmp nexthops. For the T0 topology test we will send it to a dest prefix which is pointing to the T1s. -3. Vary some tuple of the packet so that the packets hash to different nexthops. The total packets sent in a test is calcualted as follows: 1000 packets sent per ECMP next hop. This translates to 4000 packets in a T0 topology with 4 T1s(ECMP nexthops). All 4000 packets will have varied tuples to get a good distribution of packets to ports. +3. Vary some tuple of the packet so that the packets hash to different nexthops. The total packets sent in a test is calculated as follows: 1000 packets sent per ECMP next hop. This translates to 4000 packets in a T0 topology with 4 T1s(ECMP nexthops). All 4000 packets will have varied tuples to get a good distribution of packets to ports. 4. Identify set of ports on which the packet would have ecmp'd 5. 
Check which port received the packet and record received packet count per port 6. Calculate the expected number of packets per port diff --git a/doc/ecmp/ordered_ecmp_next_hop_hld.md b/doc/ecmp/ordered_ecmp_next_hop_hld.md index f962a0e847f..1dc1f18c329 100644 --- a/doc/ecmp/ordered_ecmp_next_hop_hld.md +++ b/doc/ecmp/ordered_ecmp_next_hop_hld.md @@ -17,7 +17,7 @@ * [1 Requirements Overview](#1-requirements-overview) * [1.1 Use Case](#11-use-case) - * [1.2 Acheiving Order Nexthop member in ECMP](#12-acheiving-order-nexthop-memeber-in-ecmp) + * [1.2 Achieving Order Nexthop member in ECMP](#12-achieving-order-nexthop-member-in-ecmp) * [1.3 Functional requirements](#13-functional-requirements) * [2 Modules Design](#2-modules-design) * [2.1 App DB](#21-app-db) @@ -43,19 +43,19 @@ This document talks about use-case to support ECMP with Ordered Nexthop and chan # 1 Requirements Overview ## 1.1 Use case Under the ToR (Tier0 device) there can be appliances (eg:Firewall/Software-Load Balancer) which maintain state of flows running through them. For better scaling/high-availaibility/fault-tolerance -set of appliances are used and connected to differnt ToR's. Not all the flow state that are maintained by these appliances in a set are shared between them. Thus with flow state not being sync +set of appliances are used and connected to different ToR's. Not all the flow state that are maintained by these appliances in a set are shared between them. Thus with flow state not being sync if the flow do not end up alawys on to same TOR/Appliance it can cause services (using that flow) degradation and also impact it's availability -To make sure given flow (identidied by 5 tuple) always end up on to same TOR/Appliance we need ECMP ordered support/feature on T1 (Leaf Router).
-With this feature enable even if flow land's on different T1's (which is common to happen as some link/device in the flow path goes/come to/from maintainence) -ECMP memeber being ordered will use same nexthop (T0) and thus same appliace. +To make sure given flow (identified by 5 tuple) always ends up on the same TOR/Appliance we need ECMP ordered support/feature on T1 (Leaf Router). +With this feature enabled, even if a flow lands on different T1's (which is common to happen as some link/device in the flow path goes/comes to/from maintenance) +ECMP member being ordered will use the same nexthop (T0) and thus the same appliance. Below diagram captures the use-case (Traffic is flowing from T1 <-> T0 <-> Appliance) ![](../../images/ecmp/order_ecmp_pic.png) -## 1.2 Acheiving Order Nexthop member in ECMP +## 1.2 Achieving Order Nexthop member in ECMP 1. Nexthop's will be sorted based on their IP address to get their order within the ECMP group. In typical data-center ip address allocation scheme all T1’s in a given podset/cluster have the same order for P2P v4/v6 IP Address for all downstream T0's. -2. This feature/enhacement assumes entropy calculation will be same for a given flow on each devices that have set set of nexthop in the ECMP Group. +2. This feature/enhancement assumes the entropy calculation will be the same for a given flow on each device that has the same set of nexthops in the ECMP Group. 3. This feature/enhancement is best effort in nature where if the Links/Bgp between pair of devices are not in same state (either Up/Down) then flow can take different path. ## 1.3 Functional requirements @@ -64,13 +64,13 @@ This section describes the SONiC requirements for Ordered ECMP Nexthop At a high level the following should be supported: Phase #1 -- Program ECMP memebers (nexthops) in ordered way. Above use case is for ECMP Group on T1 with nexthop memebers as T0 but requirement is generic for any ECMP Group/Tier +- Program ECMP members (nexthops) in ordered way.
Above use case is for ECMP Group on T1 with nexthop members as T0 but requirement is generic for any ECMP Group/Tier - Knob to enable/disable the order ecmp nexthop - Maintain Backward Compatible if given SAI Vendor can not support ordered ecmp - Should work with Overlay ECMP. - Handling linkdown/linkup scenarios which triggers nexthop withdrawal/addition to nexthop group. -Phase #2 (Not commited as of now) +Phase #2 (Not committed as of now) - Init time knob to configure key/parameter to use for creating ordered nexthop (default being nexthop ip address) - Warm restart support (if/when enable on T0) - Config DB based knob to enable/disable order ecmp feature. This might need system reboot diff --git a/doc/error-handling/error_handling_design_spec.md b/doc/error-handling/error_handling_design_spec.md index fdc28b5d32d..3959f88cd1b 100755 --- a/doc/error-handling/error_handling_design_spec.md +++ b/doc/error-handling/error_handling_design_spec.md @@ -120,7 +120,7 @@ Following diagram describes a high level overview of the BGP use case: ## 3.1 Error Database -As SONIC architecture relies on the use of centralized Redis-database as means of multi-process communication among all subsystems, the framework re-uses the same mechanism to notify errors back to applications. +As SONIC architecture relies on the use of centralized Redis-database as means of multi-process communication among all subsystems, the framework reuses the same mechanism to notify errors back to applications. A new database, ERROR_DB, is introduced to store the details of failed entries/objects corresponding to various tables. The ERROR_DB tables are defined in application friendly format. Applications can register as consumer of ERROR_DB table to receive error notifications, whereas OrchAgent is registered as producer of ERROR_DB table. If the SAI CREATE/SET method fails, Syncd informs OrchAgent using the notification channel of ASIC_DB. 
OrchAgent is responsible to translate the ASIC_DB notification and store it in ERROR_DB format. It is also responsible to map the SAI specific error codes to SWSS error codes. @@ -393,8 +393,8 @@ The unit test plan for error handling framework is documented below: | | 1.3 | Verify multiple applications registering for ROUTE table notifications. Generate failure event and verify all registered applications are notified. | | | 1.4 | Verify multiple applications de-register for ROUTE table notifications. Generate failure event and verify only de-registered application is NO LONGER notified. Other registered applications continue to get notified. | | | 1.5 | Verify that the notification for IPv4/IPv6 ROUTE entry contains all the required parameters as defined by the schema - Prefix/Nexthops/Opcode/Failure code. | -| | 1.6 | Verify error is notified incase of IPv4/IPv6 ROUTE add failure due to TABLE full condition. Verify entry exists in ERROR_DB for failed route with Opcode=Add and Error=Table Full. | -| | 1.7 | Verify error is notified incase of IPv4/IPv6 ROUTE add failure due to ENTRY_EXISTS condition. Verify entry exists in ERROR_DB for failed route with Opcode=Add and Error=Entry Exists. | +| | 1.6 | Verify error is notified in case of IPv4/IPv6 ROUTE add failure due to TABLE full condition. Verify entry exists in ERROR_DB for failed route with Opcode=Add and Error=Table Full. | +| | 1.7 | Verify error is notified in case of IPv4/IPv6 ROUTE add failure due to ENTRY_EXISTS condition. Verify entry exists in ERROR_DB for failed route with Opcode=Add and Error=Entry Exists. | | | 1.8 | Verify application is notified even in case of IPv4/IPv6 ROUTE is successfully programmed (NO_ERROR). Verify that there is NO entry for the route in ERROR_DB. | | | 1.9 | Verify error is notified in case of IPv4/IPv6 ROUTE deletion failure due to NOT_FOUND. Verify that there is NO entry for the failed route in ERROR_DB. 
| | | 1.10 | Verify that the failed IPv4/IPv6 ROUTE entry in ERROR_DB is cleared, when application deletes that entry. Verify other failed entries in ERROR_DB are retained. | @@ -404,8 +404,8 @@ The unit test plan for error handling framework is documented below: | | 2.3 | Verify multiple applications registerting for Neighbor table notifications. Generate failure event and verify all registered applications are notified. | | | 2.4 | Verify multiple applications de-register for Neighbor table notifications. Generate failure event and verify only de-registered application is NO LONGER notified. Other registered applications continue to get notified. | | | 2.5 | Verify that the notification for IPv4/IPv6 Neighbor entry contains all the required parameters as defined by the schema - Ifname/Prefix/Opcode/Failure code. | -| | 2.6 | Verify error is notified incase of IPv4/IPv6 NEIGHBOR add failure due to TABLE full condition. Verify entry exists in ERROR_DB for failed neighbor with Opcode=Add and Error=Table Full. | -| | 2.7 | Verify error is notified incase of IPv4/IPv6 NEIGHBOR add failure due to ENTRY_EXISTS condition. Verify entry exists in ERROR_DB for failed neighbor with Opcode=Add and Error=Entry Exists. | +| | 2.6 | Verify error is notified in case of IPv4/IPv6 NEIGHBOR add failure due to TABLE full condition. Verify entry exists in ERROR_DB for failed neighbor with Opcode=Add and Error=Table Full. | +| | 2.7 | Verify error is notified in case of IPv4/IPv6 NEIGHBOR add failure due to ENTRY_EXISTS condition. Verify entry exists in ERROR_DB for failed neighbor with Opcode=Add and Error=Entry Exists. | | | 2.8 | Verify application is notified even in case of IPv4/IPv6 NEIGHBOR is successfully programmed (NO_ERROR). Verify that there is NO entry for the neighbor in ERROR_DB in this case. | | | 2.9 | Verify error is notified in case of IPv4/IPv6 NEIGHBOR deletion failure due to NOT_FOUND. Verify that there is NO entry in ERROR_DB in this case. 
| | | 2.10 | Verify that the failed IPv4/IPv6 NEIGHBOR entry in ERROR_DB is cleared, when application deletes that entry. Verify other failed entries in ERROR_DB are retained. | @@ -426,4 +426,4 @@ The unit test plan for error handling framework is documented below: If Tid is included in the application request, framework includes the same in the notification. Otherwise, the notification is still sent, but without Tid. This enables migration of applications to start using Tid on a need basis. -- Extend error handling to other tables in the sytem (VLAN/LAG/Mirror/FDB etc). +- Extend error handling to other tables in the system (VLAN/LAG/Mirror/FDB etc). diff --git a/doc/event-alarm-framework/event-alarm-framework.md b/doc/event-alarm-framework/event-alarm-framework.md index 9238b32544e..c4332944d18 100644 --- a/doc/event-alarm-framework/event-alarm-framework.md +++ b/doc/event-alarm-framework/event-alarm-framework.md @@ -325,7 +325,7 @@ e.g., Sensor temperature critical high ### 3.1.2 Event Consumer The event consumer is a class in EventDB service that processes the incoming events. -On intitialization, event consumer reads */etc/evprofile/default.json* and builds an internal map of events, called *static_event_map*. +On initialization, event consumer reads */etc/evprofile/default.json* and builds an internal map of events, called *static_event_map*. It then subscribes to zmqproxy for events. On reading the event, using the event-id in the record, event consumer fetches static information from *static_event_map*. @@ -392,7 +392,7 @@ Alarm table is empty. All counters in ALARM_STATS is 0. System LED is Green. | ALM-1 | CRITICAL | | | ALM-2 | MINOR | | -Alarm table now has two alarms. One with *CRITICAL* and other with *MINOR*. ALARM_STATS is updated as: Critical as 1 and Minor as 1. As There is atleast one alarm with *critical/major* severity, system LED is Red. +Alarm table now has two alarms. One with *CRITICAL* and other with *MINOR*. 
ALARM_STATS is updated as: Critical as 1 and Minor as 1. As There is at least one alarm with *critical/major* severity, system LED is Red. | alarm | severity | acknowledged | |:-----:|:----------:|:------------:| @@ -412,7 +412,7 @@ Now there is an alarm with *MAJOR* severity. ALARM_STATS now reads as: Major as | ALM-2 | MINOR | | | ALM-9 | MAJOR | true | -The *MAJOR* alarm is acknowledged by user, alarm consumer sets *acknolwedged* flag to true and reduces Major counter in ALARM_STATS by 1, ALARM_STATS now reads as: Major 0 and Minor 1. This way, acknowledged major alarm has no effect on system LED. There are no other *CRITICAL/MAJOR* alarms. There however, exists an alarm with *MINOR/WARNING* severity. System LED is Amber. +The *MAJOR* alarm is acknowledged by user, alarm consumer sets *acknowledged* flag to true and reduces Major counter in ALARM_STATS by 1, ALARM_STATS now reads as: Major 0 and Minor 1. This way, acknowledged major alarm has no effect on system LED. There are no other *CRITICAL/MAJOR* alarms. There however, exists an alarm with *MINOR/WARNING* severity. System LED is Amber. | alarm | severity | acknowledged | |:-----:|:----------:|:------------:| @@ -804,7 +804,7 @@ openconfig alarms yang is defined [here](https://github.com/openconfig/public/b ``` sonic# alarm acknowledge ``` -An operator can acknolwedge a raised alarm. This indicates that the operator is aware of the fault condition and considers the condition not catastrophic. +An operator can acknowledge a raised alarm. This indicates that the operator is aware of the fault condition and considers the condition not catastrophic. Acknowledging an alarm updates alarm statistics and thereby applications like pmon can remove the particular alarm from status consideration. The alarm record in the ALARM table is marked with acknowledged field set to true. There is acknowledge-time field that indicates when that alarm is acknowledged. 
@@ -945,7 +945,7 @@ Id Action Severity Name Timestamp sonic# show alarm [ acknowledged | all | detail | summary | severity | id | start end | recent <5min|1hr|1day> | from to ] -'show alarm' command would display all the *active* alarm records in ALARM table. Acknowledged alarms wont be shown here. +'show alarm' command would display all the *active* alarm records in ALARM table. Acknowledged alarms won't be shown here. sonic# show alarm ---------------------------------------------------------------------------------------------------------------------------- @@ -1057,7 +1057,7 @@ The second command displays all the alarms that are waiting to be cleared by app # 7 Unit Test - Raise an event and verify the fields in EVENT table and EVENT_STATS table - Raise an alarm and verify the fields in ALARM table and ALARM_STATS table -- Clear an alarm and verify that record is removed from ALARM and ALARM_STATS tables are udpated +- Clear an alarm and verify that the record is removed from the ALARM table and the ALARM_STATS table is updated - Ack an alarm and verify that acknowledged flag is set to true in ALARM table and acknowledge-time is set - Un-Ack an alarm and verify that acknowledged flag is set to false in ALARM table and acknowledge-time is set - Verify wrap around for EVENT table ( change manifest file to a lower range and trigger that many events ) diff --git a/doc/event-alarm-framework/events-producer.md b/doc/event-alarm-framework/events-producer.md index 48a4e9ce49e..cd6de6360cf 100644 --- a/doc/event-alarm-framework/events-producer.md +++ b/doc/event-alarm-framework/events-producer.md @@ -223,7 +223,7 @@ The event will now be published as below per schema. The instance data would ind ## gNMI client A gNMI client could subscribe for events in streaming mode. -At the rate of 10K/second and to conserve switch resources, only one gNMI client is supported and hence all events are sent to the client with no additonal filtering. 
It is expected that the client will save events in a an external storage and consumer clients can watch/query from the external resource with filters. +At the rate of 10K/second and to conserve switch resources, only one gNMI client is supported and hence all events are sent to the client with no additional filtering. It is expected that the client will save events in an external storage and consumer clients can watch/query from the external resource with filters. Below shows the command & o/p for subscribing all events. ``` gnmic --target events --path "/events/" --mode STREAM --stream-mode ON_CHANGE ``` @@ -314,7 +314,7 @@ The libswsscommon will have the APIs for publishing & receiving. ## exporter 1. Telemetry container runs a gNMI server to export events to external receiver/collector via SUBSCRIBE request. 2. Telemetry container sends all the events to the receiver in FIFO order. -3. Telemetry container ensures atleast one event sent every N seconds, by sending a heartbeat/no-op event when there are no events to publish. +3. Telemetry container ensures at least one event is sent every N seconds, by sending a heartbeat/no-op event when there are no events to publish. 4. Telemetry container uses an internal buffer, when local publishing rates overwhelms the receiver. - Internal buffer overflow will cause new events to be dropped. - The dropped events are counted and recorded in STATE-DB via stats. @@ -323,7 +323,7 @@ The libswsscommon will have the APIs for publishing & receiving. - A long downtime can result in message drop due to cache overflow. - A unplanned telemetry service down (say crash) will not use the cache service -5. The stats for maintained for SLA compliance verification. This inlcudes like total count of events sent, missed count, ... +5. The stats are maintained for SLA compliance verification. This includes counters like total count of events sent, missed count, ... - The stats are collected and recorded in STATE-DB. 
- An external gNMI client could subscribe for stats table updates' streaming ON-CHANGE. @@ -524,11 +524,11 @@ The event detection could happen in many ways ### Log message based detection At high level: 1. This is a two step process. -2. The process raising the event sends a sylog message out. +2. The process raising the event sends a syslog message out. 3. A watcher scans all the syslog messages emitted and parse/check for events of interest. 4. When matching message arrives, publish the event -Here you have code that sends the log and a watcher who has the regex pattern for that log message to match. Anytime the log messsage is changed the pattern has to be updated for the event to fire consistently across releases. +Here you have code that sends the log and a watcher who has the regex pattern for that log message to match. Anytime the log message is changed the pattern has to be updated for the event to fire consistently across releases. Though this sounds like a redundant/roundabout way, this helps as below. - For III party code, not owned by SONiC, this is an acceptable solution. @@ -542,7 +542,7 @@ Though this sounds like a redundant/roundabout way, this helps as below. - For logs raised by host processes, configure this plugin at host. - For logs raised by processes inside the container, configure plugin inside the container. This helps in container upgrade scenarios and as well help with load distribution. - The plugin can be configured using rsyslog properties to help scale into multiple instances, so a single instance see only a subset of logs pre-filtered by rsyslog. - - A plugin instance could receive messasges **only** for processes that it is configured for. + - A plugin instance could receive messages **only** for processes that it is configured for. - The plugin is provided with the list of regex patterns to use for matching messages. Each pattern is associated with the name of event source and the tag. 
- The regex pattern is present as files as one per plugin instance, so an instance sees only the regex expressions that it could match. @@ -685,7 +685,7 @@ The message reliability is ensured as BEST effort. There are 3 kinds of missed m # STATS update -The stats are collected and updaed periodically in DB. The stats can be used to assess the performance and SLA (_Service Level Agreement_) compliance.
+The stats are collected and updated periodically in DB. The stats can be used to assess the performance and SLA (_Service Level Agreement_) compliance.
The stats are collected by telemetry service that serves the main receiver. Hence the stats update occur only when main receiver is connected.
- The counters are persisted in STATE-DB with keys as "EVENT-STATS|< counter name >" @@ -716,7 +716,7 @@ The stats are collected by telemetry service that serves the main receiver. Henc # CLI -- Show commands is provided to vew STATS collected +- Show commands is provided to view STATS collected - gnmi cli commands ``` @@ -726,7 +726,7 @@ gnmi_cli -client_types=gnmi -a 127.0.0.1:50051 -t EVENTS -logtostderr -insecure # heartbeat=n sets to every n seconds if n>0. gnmi_cli -client_types=gnmi -a 127.0.0.1:50051 -t EVENTS -logtostderr -insecure -v 7 -streaming_type ON_CHANGE -q all[heartbeat=5] -qt s -# Sets pq max size to be 1000; The q between Telemetry container and the exernal gNMI connection. +# Sets pq max size to be 1000; The q between Telemetry container and the external gNMI connection. gnmi_cli -client_types=gnmi -a 127.0.0.1:50051 -t EVENTS -logtostderr -insecure -v 7 -streaming_type ON_CHANGE -q all[heartbeat=5][qsize=1000] -qt s ``` @@ -779,7 +779,7 @@ namespace SONIC_EVENTS_BGP { ## Event definition enhancements - The consumer of events get the instance data of event as a key-value pair, where key points to the YANG model. -- The YANG schema defintion could be enhanced with additional custom data types created using YANG extensions. +- The YANG schema definition could be enhanced with additional custom data types created using YANG extensions. - An extension could be defined for "severity". The developer of the schema could use this to specify the severity of an event added. - An extension could be defined for globally unique event-id, which could be used by event consumer, when publishing the event to external parties. 
diff --git a/doc/express-reboot/Cisco_8000_Express_Reboot_HLD.md b/doc/express-reboot/Cisco_8000_Express_Reboot_HLD.md index c7ee257c002..b14a9b46834 100644 --- a/doc/express-reboot/Cisco_8000_Express_Reboot_HLD.md +++ b/doc/express-reboot/Cisco_8000_Express_Reboot_HLD.md @@ -9,7 +9,7 @@ The goal of Sonic express reboot is to be able to restart and upgrade SONiC soft Figure 1. Express boot flow

-It can be seen from Figure 1, it is possible that punt-header-v1 reachs SONIC-v2 or inject-header-v2 reachs NPU-v1 during t1 to t2 window. The punt and inject header changes are rare and not commom. Currently punt and inject header data structure differences between V1 and V2 are handled case by case basis in S1 SDK internally. A more generic and scalable approach for it is being planned and will be shared with the community as an express boot phase-2 once it is finalized. But in principle, maintaining backward compatibility in punt-inject-header if there are any changes is essential for successful express boot upgrades. +As can be seen from Figure 1, it is possible that punt-header-v1 reaches SONIC-v2 or inject-header-v2 reaches NPU-v1 during the t1 to t2 window. The punt and inject header changes are rare and not common. Currently, punt and inject header data structure differences between V1 and V2 are handled on a case-by-case basis in the S1 SDK internally. A more generic and scalable approach for it is being planned and will be shared with the community as an express boot phase-2 once it is finalized. But in principle, maintaining backward compatibility in the punt-inject-header if there are any changes is essential for successful express boot upgrades. Figure 2 below compares major steps taken in warm boot and express boot from both SONiC and SDK point of view. diff --git a/doc/fast-reboot/Fast-reboot_Flow_Improvements_HLD.md index 280a9b6336e..899ce2c5fd6 100644 --- a/doc/fast-reboot/Fast-reboot_Flow_Improvements_HLD.md +++ b/doc/fast-reboot/Fast-reboot_Flow_Improvements_HLD.md @@ -27,7 +27,7 @@ The goal of SONiC fast-reboot is to be able to restart and upgrade SONiC software with a data plane disruption less than 30 seconds and control plane less than 90 seconds. With current implementation there is no indication of the fast-reboot status, meaning we don't have a way to determine if the flow has finished or not. 
-Some feature flows in SONiC are delayed with a timer to keep the CPU dedicated to the fast-reboot init flow for best perforamnce, like enablement of flex counters. +Some feature flows in SONiC are delayed with a timer to keep the CPU dedicated to the fast-reboot init flow for best performance, like enablement of flex counters. In order to have such indicator, re-use of the fastfast-reboot infrastructure can be used. Each network application will experience similar processing flow. @@ -48,7 +48,7 @@ https://github.com/sonic-net/sonic-buildimage/blob/master/dockers/docker-orchage # 2 Functional Requirements -The new Fast-reboot design should meet the following requirments: +The new Fast-reboot design should meet the following requirements: - Reboot the switch into a new SONiC software version using kexec - less than 5 seconds. - Upgrade the switch FW by the new SONiC image if needed. @@ -75,7 +75,7 @@ The restart of syncd docker should leave data plane intact until it starts again Fast-reboot will finish successfully from a different NOS than SONiC with two possible scenarios: - Dump files of default gateway, neighbors and fdb tables are provided to the new image in a format that meet the SONiC scheme, as SONiC does prior the reboot. - - On this scenario all should work exacly the same as the switch rebooted from SONiC to SONiC. + - On this scenario all should work exactly the same as the switch rebooted from SONiC to SONiC. - Dump files of default gateway, neighbors and fdb tables are not provided to the new image as SONiC does prior the reboot. - On this scenario fast-reboot will finish successfully, but with low performance since all neighbors and fdb entries will be created by the slow path. @@ -148,7 +148,7 @@ Same for FDB entries which will be created by the kernel as well, depends on the When orchagent starts with the new SONiC image, the same infrastructure we use to reconcile fastfast-boot will start. 
After INIT_VIEW and create_switch functions sent to syncd (reset of the ASIC took place here), 'warmRestoreAndSyncUp' will be executed. This function will populate m_toSync with all tasks for syncd, by APP DB and CONFIG DB prior the reboot. -To verify orchagent reached the same state as before the reboot, 'warmRestoreValidation' will verify no pending tasks left in the queue, meaning all proccessed succesfully and in the pipeline for syncd to configure the HW. +To verify orchagent reached the same state as before the reboot, 'warmRestoreValidation' will verify no pending tasks left in the queue, meaning all processed successfully and in the pipeline for syncd to configure the HW. At the end APPLY_VIEW will be sent to syncd to finalize the process, from this point orchagent enter the main loop and operates normally. ### NOTICE @@ -161,7 +161,7 @@ This is solvable by the db migrator. Syncd starts with the fast-reboot flag, trigger the ASIC reset when create_switch is requested from orchagent. In addition, on this case temp view flag will set to false since it is not required, no comparison logic needed since current view is empty. Basically INIT and APPLY view requests from orchagent are ignored by syncd, but bound the process from start to end. -During reconsilations process of orchagent, syncd will recieve all tasks to restore the previous state. +During the reconciliation process of orchagent, syncd will receive all tasks to restore the previous state. All other network applications will do the same as we do today for warm-reboot. ![Syncd](/doc/fast-reboot/Orchagent_Syncd.svg) @@ -221,8 +221,8 @@ reboot-finalizer.sh (warm-finalizer.sh) script must also be templatized and upda | /service/fast-shutdown/ | object | no | Fast reboot related properties. Used to generate the fast-reboot script. | | /service/fast-shutdown/after | lits of strings | no | Same as for warm-shutdown. | | /service/fast-shutdown/before | lits of strings | no | Same as for warm-shutdown. 
| -| /processes | object | no | Processes infromation | -| /processes/[name]/reconciles | boolean | no | Wether process performs warm-boot reconciliation, the warmboot-finalizer service has to wait for. Defaults to False. | +| /processes | object | no | Processes information | +| /processes/[name]/reconciles | boolean | no | Whether the process performs warm-boot reconciliation that the warmboot-finalizer service has to wait for. Defaults to False. | This chapter it taken from SONiC Application Extension Infrastructure HLD: diff --git a/doc/fips/SONiC-OpenSSL-FIPS-140-3-deployment.md index e722984b8ea..44d0e0e2652 100644 --- a/doc/fips/SONiC-OpenSSL-FIPS-140-3-deployment.md +++ b/doc/fips/SONiC-OpenSSL-FIPS-140-3-deployment.md @@ -10,7 +10,7 @@ Table of Contents * [FIPS None Enforce Mode](#FIPS-None-Enforce-Mode) * [FIPS Enforce Mode](#FIPS-Enforce-Mode) * [SONiC FIPS State](#SONiC-FIPS-State) -* [SONiC reboot and upgarde](#SONiC-reboot-and-upgarde) +* [SONiC reboot and upgrade](#SONiC-reboot-and-upgrade) * [SONiC warm-reboot or fast-reboot](#SONiC-warm-reboot-or-fast-reboot) * [SONiC upgrade](#SONiC-upgrade) * [Test cases](#Test-cases) @@ -29,7 +29,7 @@ It is for the security requirement, the FIPS 140-3 feature should be enabled for - Provide a way to enforce the FIPS for SONiC. ## Scopes -1. The FIPS 140-3 is only availabel on SONiC OS Version 11 or above. +1. The FIPS 140-3 is only available on SONiC OS Version 11 or above. 2. FIPS is supported on branches: 202205, 202211, master. ## SONiC Configuration for FIPS @@ -95,7 +95,7 @@ The redis dictionary key is FIPS_STAT\|state. 
GitHub Pull Request for reference: https://github.com/sonic-net/sonic-host-services/pull/69 -## SONiC reboot and upgarde +## SONiC reboot and upgrade ### SONiC warm-reboot or fast-reboot SONiC ware-reboot/fast-reboot will initialize the kernel command line, it only has impact when the FIPS enforcement flag changed, either from enforce to none-enforce, or from none-enforce to enforce. diff --git a/doc/fips/SONiC-OpenSSL-FIPS-140-3.md b/doc/fips/SONiC-OpenSSL-FIPS-140-3.md index ea42dfe527e..9ec88f4f709 100644 --- a/doc/fips/SONiC-OpenSSL-FIPS-140-3.md +++ b/doc/fips/SONiC-OpenSSL-FIPS-140-3.md @@ -111,7 +111,7 @@ Files in the packages: Kerberos will use the builtin cryptographic module by default, but it allows to change the build option to use OpenSSl, see [MIT Kerberos features](https://web.mit.edu/kerberos/krb5-1.13/doc/mitK5features.html). SONiC will change the build option to use OpenSSL instead of the builtin one. It is not configurable to use the Kerberos builtin cryptographic module when OpenSSL used. ## Golang Cryptographic Module -Golang has its own cryptographic module (see [crypto](https://github.com/golang/go/tree/master/src/crypto)) without FIPS supports. There are some branches with branch name starting with "dev.boringcrypto" (see [golang branches](https://github.com/golang/go/branches/all?query=dev.boringcrypto)), changing the Golang cryptographic APIs' referenece to use [BoringSSL](https://github.com/google/boringssl). Although BoringSSL is an open source project, but it used by Google only, not intened for general use. +Golang has its own cryptographic module (see [crypto](https://github.com/golang/go/tree/master/src/crypto)) without FIPS supports. There are some branches with branch name starting with "dev.boringcrypto" (see [golang branches](https://github.com/golang/go/branches/all?query=dev.boringcrypto)), changing the Golang cryptographic APIs' reference to use [BoringSSL](https://github.com/google/boringssl). 
Although BoringSSL is an open source project, it is used by Google only and is not intended for general use. To support FIPS for Golang, RedHat offers an alternative solution (see [here](https://developers.redhat.com/blog/2019/06/24/go-and-fips-140-2-on-red-hat-enterprise-linux)), it builds on top of the Golang's dev.bringcrypt branches, has ability to call into OpenSSL, not BoringSSL. SONiC can reuse the RedHat sulotion, one difference is that RedHat supports FIPS for OpenSSL directly, SONiC uses OpenSSL Engine. @@ -123,7 +123,7 @@ When FIPS enabled, both of the BoringSSL Enable Option and the SymCrypt Enabled ## Application Impact Some of functions of a application might be broken when using the cryptographic algorithms that are not FIPS compliant. It is relied on the tests of the applications to detect all the impact functions. -For OpenSSH, Centos provides a [patch](https://git.centos.org/rpms/openssh/raw/c8/f/SOURCES/openssh-7.7p1-fips.patch) which is compiant with FIPS 140-2. We can apply the patch and verify if it can pass all the OpenSSH test cases when FIPS enabled. +For OpenSSH, Centos provides a [patch](https://git.centos.org/rpms/openssh/raw/c8/f/SOURCES/openssh-7.7p1-fips.patch) which is compliant with FIPS 140-2. We can apply the patch and verify if it can pass all the OpenSSH test cases when FIPS enabled. ## SONiC FIPS Configuration @@ -135,7 +135,7 @@ grep 'sonic_fips=1' /proc/cmdline There is another parameter fips=1 supported for SymCrypt OpenSSL to enable FIPS. The parameter will enable the Linux Kernel FIPS, but the Linux Kernel FIPS is not supported yet, and it is out of scope in this document. In future, when the FIPS is supported by SONiC Linux Kernel, and the parameter fips=1 has already set, it is not necessary to set sonic_fips=1. 
-For grub, one of implemetation as below: +For grub, one implementation is as below: cat /etc/grub.d/99-fips.cfg ``` GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT sonic_fips=1" ``` @@ -174,7 +174,7 @@ Support to enable/disable the FIPS feature, the feature is enabled by default in ``` INCLUDE_FIPS ?= y ``` -Support to enable/disable FIPS config, the flage is disabled by default. IF the option is set, then the fips is enabled by default in the image, not necesary to do the config in system level or application level. +Support to enable/disable FIPS config, the flag is disabled by default. If the option is set, then FIPS is enabled by default in the image; it is not necessary to do the config at system level or application level. ``` ENABLE_FIPS ?= n ``` diff --git a/doc/fips/SONiC-SAI-POST.md index a89847038b2..41ca320665b 100644 --- a/doc/fips/SONiC-SAI-POST.md +++ b/doc/fips/SONiC-SAI-POST.md @@ -8,7 +8,7 @@ ## Table of Contents * [Overview](#Overview) * [Design requirements](#Design-requirements) -* [Deisgn details](#Design-details) +* [Design details](#Design-details) * [State DB](#State-DB) * [Enabling POST in SAI switch init](#Enabling-POST-in-SAI-switch-init) * [Enabling POST in SAI MACSec init](#Enabling-POST-in-SAI-MACSec-init) @@ -28,7 +28,7 @@ The design must meet the following requirements: - POST failure must not affect the operation of non-MACSec ports. - Explicit visibility must be provided if POST fails, for example, in syslog. The syslog message must include the details of the failure. For example, SAI object Id of ports that fail POST and the corresponding MACSec engine. -## Deisgn details +## Design details The following figure depicts the data flow and SONiC components in the design. Orchagent is responsible for triggering POST via SAI calls and publishing POST status in State DB. MACSec container, precisely MACSecMgr, is enhanced to be POST aware and only process MACSec configuration after POST has passed. 
@@ -85,7 +85,7 @@ Since SAI supports POST completion callback, a callback or notification function If SAI POST fails, MACSecOrch reads POST status of all MACSec ports and finds out which port has failed in POST. MACSecOrch then adds the details of the failure in syslog. The following syslog is added to report SAI POST failure. -Swith level POST failure +Switch level POST failure ``` Switch MACSec POST failed ``` diff --git a/doc/fwutil/fwutil.md b/doc/fwutil/fwutil.md index 11ca4266f3f..b6e960cefdf 100755 --- a/doc/fwutil/fwutil.md +++ b/doc/fwutil/fwutil.md @@ -334,7 +334,7 @@ Chassis1 N/A CPLD /cpld.bin 5 / 10 - `fwutil show updates` command only displays for the components which have the firmware image path available in platform_components.json ``` -**The following command displays the Component FW update satus (only available for `fwutil update all` command):** +**The following command displays the Component FW update status (only available for `fwutil update all` command):** 1. update status ```bash root@sonic:~# fwutil show update status @@ -410,7 +410,7 @@ for automatic FW installation of various platform components. Automatic FW installation requires "platform_components.json" to be created and placed at: _sonic-buildimage/device///platform_components.json_ -Recommanded image path = /lib/firmware// +Recommended image path = /lib/firmware// **Example:** 1. Non modular chassis platform @@ -720,7 +720,7 @@ MSN2700/SSD firmware auto-update starting: /lib/firmware/mlnx/ssd.bin with fast ... SSD firmware auto-update status from 4 to 5: scheduled - installation scheduled for fast reboot ... -MSN2700/CPLD firmware auto-update starting: /lib/firware/mlnx/cpld.bin with fast +MSN2700/CPLD firmware auto-update starting: /lib/firmware/mlnx/cpld.bin with fast ... CPLD firmware auto-update status from 5 to 10: skipped - warm reboot not supported for auto-update All firmware auto-update has been performed. 
@@ -741,7 +741,7 @@ MSN2700/SSD firmware auto-update starting: /lib/firmware/mlnx/ssd.bin with cold ... SSD firmware auto-update status from 4 to 5: scheduled - installation scheduled for cold reboot ... -MSN2700/CPLD firmware auto-update starting: /lib/firware/mlnx/cpld.bin with cold +MSN2700/CPLD firmware auto-update starting: /lib/firmware/mlnx/cpld.bin with cold ... CPLD firmware auto-update status from 5 to 10: installed - need cold reboot to be completed All firmware auto-update has been performed. @@ -831,7 +831,7 @@ the platform component utility will perform the equivalent process of `auto-upda The component utility can perform the firmware update if the firmware update doesn't need any boot action required after the update. Otherwise, it will create a task file if any process or handling for the component firmware update needs to be done during the reboot and also if the update can be done for the specified reboot type. -The componenet utility should be defined with key value `utility` in the component object of `platform_components.json` to be called by fwutil instead of the platform api. +The component utility should be defined with key value `utility` in the component object of `platform_components.json` to be called by fwutil instead of the platform api. The task file will be platform-specific. **Example:** @@ -877,10 +877,10 @@ Here are the interface requirements to support them. 
- auto-update interface needs two arguments : image_path and boot_type - response : the return_code that indicates the status of auto-update (please refer to section 2.2.2.4.1) -**Optional) The utility can be supported for other platform api substitues like `compoenent_update` and `compoenent_install` with |-u(--update)|-i(--install)** +**(Optional) The utility can be supported for other platform api substitutes like `component_update` and `component_install` with |-u(--update)|-i(--install)** The component utility needs to be called by the FWutil command to perform the firmware auto-update process if it's defined in the `platform_components.json`, otherwise, the platform component api will be called. -The componet utility path will be pased from the `platform_components.json` and be executed by fwutil. +The component utility path will be passed from the `platform_components.json` and be executed by fwutil. Below shows how the utility can be executed for the auto-update interface. ```bash ... @@ -903,7 +903,7 @@ If any specific component firmware update needs to be done only during the reboo Platform firmware update reboot plugin will handle the task during the rebooot and will be invoked by the reboot script with its reboot-type. The plugin is expected to analyze the task file to understand what component firmware update has been scheduled for which reboot and determine if the component firmware update can be performed for the reboot or not. After the determination, firmware update will be done by the plugin if any firmware update is scheduled for the reboot. -If the passed reboot_type to the plugin is different than the boot_type of task file, the pluin should exit with error code so that the reboot script can fail for the error case. +If the reboot_type passed to the plugin is different from the boot_type of the task file, the plugin should exit with an error code so that the reboot script can fail for the error case. 
```bash PLATFORM_FW_AU_REBOOT_HANDLE="platform_fw_au_reboot_handle" diff --git a/doc/gearbox/gearbox_mdio-HLD.md b/doc/gearbox/gearbox_mdio-HLD.md index 5639601e629..f197c9bb854 100755 --- a/doc/gearbox/gearbox_mdio-HLD.md +++ b/doc/gearbox/gearbox_mdio-HLD.md @@ -69,7 +69,7 @@ The Ethernet switches of today often have PHY, re-timer and mux. Some PHY is int ### Requirements -The syncd docker and daemon use the SAI library to service the NPU programming. The SAI library uses PCIe to access the NPU hardware. The gbsyncd docker and daemon use the PAI library to service the external PHY configuration processing. The PAI library usualy uses MDIO to access the PHY hardware. +The syncd docker and daemon use the SAI library to service the NPU programming. The SAI library uses PCIe to access the NPU hardware. The gbsyncd docker and daemon use the PAI library to service the external PHY configuration processing. The PAI library usually uses MDIO to access the PHY hardware. It depends on the switch hardware design that the external PHY could be connected to a FPGA or CPLD based MDIO controller or a switch NPU MDIO bus. The FPGA or CPLD based MDIO controller often has linux kernel driver and provides linux sysfs programming interface. The switch NPU MDIO bus uses SAI library hence an Inter-Process-Communication (IPC) mechanism is required between the syncd daemon and gbsyncd daemon. ![libPAI MDIO access](images/PAI-MDIO.png) @@ -80,9 +80,9 @@ When a configured platform target is built, there is only one syncd docker as th ### Architecture Design -There are many choices for the IPC mechanism between the syncd daemon and gbsyncd daemon. One performance requirement is that it should finsh firmware download within a reasonable time. Our design choice is to use the Unix socket as the IPC mechanism. Our design has the MDIO IPC server in the syncd daemon with its own thread. 
A new syncd class MdioIpcServer is added to start a new thread, to create an unix socket, to listen on the socket, to accept connection and to read/reply IPC messages. +There are many choices for the IPC mechanism between the syncd daemon and gbsyncd daemon. One performance requirement is that it should finish firmware download within a reasonable time. Our design choice is to use the Unix socket as the IPC mechanism. Our design has the MDIO IPC server in the syncd daemon with its own thread. A new syncd class MdioIpcServer is added to start a new thread, to create a Unix socket, to listen on the socket, to accept connections, and to read/reply IPC messages. -There is a corresponding MDIO access IPC client code in the form of dynamic link library which provides the flexiblity to load the library at runtime. Assuming the MDIO access library for sysfs is also in the form of dynamic library, gbsyncd can select the MDIO access library at runtime based on some configuration in the gearbox\_config.json file. +There is a corresponding MDIO access IPC client code in the form of dynamic link library which provides the flexibility to load the library at runtime. Assuming the MDIO access library for sysfs is also in the form of dynamic library, gbsyncd can select the MDIO access library at runtime based on some configuration in the gearbox\_config.json file. The same gearbox\_config.json file already has the information of the PAI library name. The information can be used to dynamically load the PAI library at runtime. @@ -113,7 +113,7 @@ The high level design of the gbsyncd mdio access function using the mdio bus fro - During the warmboot, the creation of the Unix IPC socket and connection is the same as of coldboot. - The platform module software should not reset the external PHY during warmboot. -The VendorPai class will inherit most member functions from the VendorSai class. The VendorSai class needs some changes to accomdate the inheritance. 
+The VendorPai class will inherit most member functions from the VendorSai class. The VendorSai class needs some changes to accommodate the inheritance. When a syncd instance runs inside the gbsyncd docker, a new command line option --paiInstance or -i with an integer argument is required. A CommandLineOptions class variable m_paiInstance stores the argument value. The syncd instance in gbsyncd docker already uses another command line option -x to point the configuration file "gearbox\_config.json". The CommandLineOptions class variable m_contextConfig stores the configuration file name. diff --git a/doc/gearbox/gearbox_mgr_design.md b/doc/gearbox/gearbox_mgr_design.md index b43281e108f..880486a874d 100644 --- a/doc/gearbox/gearbox_mgr_design.md +++ b/doc/gearbox/gearbox_mgr_design.md @@ -118,7 +118,7 @@ In order to isolate gearbox functionality and complexity, the Gearbox Manager im ![Gearbox Overview](images/gearbox_overview.png) ### 3.1.1 ORCHAGENT (modified) -Upon startup or reboot, portsyncd is started as well as the new gearsyncd deamon. The Orchagent is still responsible for creating the ASIC switch and the associated host interfaces. The internal doPortTask has been modified to support both internal port and Gearbox related events. +Upon startup or reboot, portsyncd is started as well as the new gearsyncd daemon. The Orchagent is still responsible for creating the ASIC switch and the associated host interfaces. The internal doPortTask has been modified to support both internal port and Gearbox related events. 
![Gearbox ORCHAGENT FLOW](images/gearbox_orchagent_flow.png)

diff --git a/doc/grpc_client/design_doc.md b/doc/grpc_client/design_doc.md
index 89355e40872..9b132acc11a 100644
--- a/doc/grpc_client/design_doc.md
+++ b/doc/grpc_client/design_doc.md
@@ -1,549 +1,549 @@
-## gRPC client for active-active DualToR scenario design
-
-
-Table of Contents
-=================
-* [Scope](#scope)
-* [Requirements](#requirements)
-* [why gRPC](#whygrpc)
-* [Hardware Overview and Overall Architecture](#hardware-overview-and-overall-archtecture)
- * [Hardware Overview](#hardware-overview)
- * [Host Architecture](#host-architecture)
- * [DualToR architecture](#dualtor-redundancy-achievment-using-active-active-solution)
-* [Proto and Schema Definition](#proto-and-schema-definition)
- * [Proto Definition for forwarding State](#proto-definition-interface-to-state-machine-for-getset-admin-state-of-the-fpga-ports)
- * [Schema Definition for DBs](#ycabled-functional-schema-for-data-exchanged-between-orchagent-and-linkmgr)
-* [gRPC channel customisation and Telemetry Schema](#grpc-channel-customisation-and-telemetry-schema)
- * [Keepalive mechanism for channels](#keepalive-for-grpc-channelstub)
-* [gRPC client communication to SoC over Loopback IP](#grpc-client-communicate-to-soc-over-loopback-ip)
-* [gRPC communication over secure channel](#grpc-commuication-over-secure-channel)
-* [gRPC client initialization/deployment](#deployment)
-* [gRPC communication with NIC simulator](#grpc-communication-with-nic-simulator)
- * [Interceptor Solution for NIC simulator](#proposed-solution-using-grpc-interceptor-inside-the-client)
- * [Multiple servers for NIC simulator](#proposed-solution-using-multiple-grpc-servers-inside-nic-simulator)
-* [Proto and Schema Definition for Async notification](#proto-definition-interface-to-soc-to-notify-ycabled-about-service-notification)
-
-
-## Revision
-
-| Rev | Date | Author | Change Description |
-|:---:|:--------:|:---------------:|--------------------|
-| 0.1 | 04/1/22 | Vaibhav Dahiya | Initial version |
-| 0.2 | 02/1/22 | Vaibhav Dahiya | Make changes to be shared with Core Team |
-
-
-## Scope
-Design doc for the gRPC client, which communicates with the SoC in the DualToR active-active setup and with the NIC simulator used by the testing infrastructure in SONiC MGMT.
-### Overview
-
-This document summarizes the approach taken to accommodate a gRPC client for the DualToR active-active scenario. The gRPC client daemon's main purpose is to provide a way for linkmgr to exchange RPCs with the SoC, and to do this within the SONiC PMON docker.
-
-
-## Requirements
-
-- provide a service/daemon in SONiC to run in DualToR mode, which can interact with the Platform API as well as with the state machine (aka linkmgr) and orchagent, to provide the capability to get/set Link State, Forwarding State, etc. from the SoC (the gRPC server listening to the client)
-- the gRPC daemon service should be able to exchange RPCs with the gRPC server running on the SoC over a secure channel
-- provide a schema for this daemon to publish to the State DB on the host, which would monitor the aspects of gRPC state for all SoCs running as servers.
-- provide an interface/method for the gRPC daemon to exchange RPCs with the gRPC server running on the SoC using a loopback IP as the source IP.
-- provide an interface for the SoC to notify this gRPC client about going to maintenance/shutdown via an asynchronous method.
-- gRPC client communication with the NIC simulator (which will run in SONiC-Mgmt testbeds) should also be provided to exchange RPCs.
-- provide a way to monitor gRPC client and channel health, so corrective/monitoring action can be implemented within the SONiC ecosystem
-
-
-## whygRPC
-
-## why gRPC for communication between ToR and the SoC
-
-The notes below summarize the main advantages of gRPC; a link for learning more follows at the end of this section.
-
-- Lightweight messages. Depending on the type of call, gRPC-specific messages can be up to 30-50 percent smaller in size than JSON messages.
-- High performance.
By different evaluations, gRPC is 5, 7, and even 8 times faster than REST+JSON communication.
-- Built-in code generation. gRPC has automated code generation in different programming languages including Java, C++, Python, Go, Dart, Objective-C, Ruby, and more.
-- More connection options. While REST focuses on request-response architecture, gRPC provides support for data streaming with event-driven architectures: server-side streaming, client-side streaming, and bidirectional streaming.
-- Healthy developer ecosystem. gRPC is open source, which makes it easier to get acquainted with its libraries/APIs and to troubleshoot bugs/issues.
-
-More resources for learning gRPC and its advantages:
-
-[grpc github repo](https://github.com/grpc/grpc)
-
-
-## Hardware Overview and overall Archtecture
-
-### Hardware Overview
-
-![Hardware Overview](images/gRPC_overall.png)
-
-### HOST architecture
-
-HOST and FPGA functionality is explained in this diagram:
-
-![Hardware Overview](images/gRPC_host.png)
-
-### DualToR redundancy achievment using Active-Active solution
-
-![Hardware Overview](images/failover.png)
-
-## Proto and Schema definition
-
-
-### Proto Definition Interface to State Machine for Get/Set Admin state of the FPGA Ports
-
-
-The proto file (proto3 syntax) used for generating gRPC code in Python3 is as follows.
gRPC tools can be used to generate the corresponding library code in any language; ycabled employs Python3 to achieve this.
-
- ```proto
- syntax = "proto3";
-
- service DualToRActive {
-     rpc QueryAdminForwardingPortState(AdminRequest) returns (AdminReply) {} // queries the Admin Forwarding State of the FPGA
-     rpc SetAdminForwardingPortState(AdminRequest) returns (AdminReply) {} // sets the Admin Forwarding State of the FPGA
-     rpc QueryOperationPortState(OperationRequest) returns (OperationReply) {} // queries the Operation State of the FPGA
-     rpc QueryLinkState(LinkStateRequest) returns (LinkStateReply) {} // queries the Link State of the FPGA
-     rpc QueryServerVersion(ServerVersionRequest) returns (ServerVersionReply) {} // queries the version of the Server running
- }
-
- message AdminRequest {
-     repeated int32 portid = 1;
-     repeated bool state = 2;
- }
-
- message AdminReply {
-     repeated int32 portid = 1;
-     repeated bool state = 2;
- }
-
- message OperationRequest {
-     repeated int32 portid = 1;
- }
-
- message OperationReply {
-     repeated int32 portid = 1;
-     repeated bool state = 2;
- }
-
- message LinkStateRequest {
-     repeated int32 portid = 1;
- }
-
- message LinkStateReply {
-     repeated int32 portid = 1;
-     repeated bool state = 2;
- }
-
- message ServerVersionRequest {
-     string version = 1;
- }
-
- message ServerVersionReply {
-     string version = 1;
- }
-
- ```
-
-- The QueryAdminForwardingPortState RPC is used to query the Admin Forwarding State of the FPGA. It takes an AdminRequest message as input and returns an AdminReply message as output.
-
-- The SetAdminForwardingPortState RPC is used to set the Admin Forwarding State of the FPGA. It takes an AdminRequest message as input and returns an AdminReply message as output.
-
-- The QueryOperationPortState RPC is used to query the Operation State of the FPGA. It takes an OperationRequest message as input and returns an OperationReply message as output.
-
-- The QueryLinkState RPC is used to query the Link State of the FPGA.
It takes a LinkStateRequest message as input and returns a LinkStateReply message as output. - -- The QueryServerVersion RPC is used to query the version of the server running. It takes a ServerVersionRequest message as input and returns a ServerVersionReply message as output. - -- The AdminRequest message contains two repeated fields, portid and state, where portid is a list of integers representing the ID of the port, and state is a list of booleans representing the state of the port. - -- The AdminReply message has the same fields as AdminRequest. - -- The OperationRequest message contains a single repeated field portid which is a list of integers representing the ID of the port. - -- The OperationReply message has two repeated fields, portid and state, where portid is a list of integers representing the ID of the port, and state is a list of booleans representing the state of the port. - -- The LinkStateRequest message has the same field as OperationRequest. - -- The LinkStateReply message has the same fields as OperationReply. - -- The ServerVersionRequest message contains a single string field version representing the version of the server. - -- The ServerVersionReply message has the same field as ServerVersionRequest. 
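As a dependency-free illustration of the message shapes described in the bullets above, the sketch below mimics the AdminRequest/AdminReply pairing with plain Python dicts. The helper names are hypothetical; a real client would use the stub and message classes generated from the proto file by gRPC tools.

```python
# Hypothetical helpers mirroring the proto semantics: portid and state
# are parallel repeated fields, so portid[i] pairs with state[i].

def make_admin_request(portids, states):
    """Build an AdminRequest-like dict from parallel port-ID/state lists."""
    if len(portids) != len(states):
        raise ValueError("portid and state must be parallel lists")
    return {"portid": list(portids), "state": list(states)}

def set_admin_forwarding_port_state(request):
    """Toy stand-in for the SetAdminForwardingPortState RPC: the reply
    carries the same repeated fields as the request."""
    return {"portid": request["portid"], "state": request["state"]}

request = make_admin_request([0, 1], [True, False])
reply = set_admin_forwarding_port_state(request)
```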
-
-
-## Ycabled Functional Schema for Data Exchanged between orchagent and linkmgr
-
-- Ycabled would exchange data/state with orchagent and linkmgr with the following schema:
-
- ```
- APP_DB
- MUX_CABLE_TABLE | PORTNAME ; written by linkmgrd, reacted on by orchagent
-  - state: active | standby
- HW_MUX_CABLE_TABLE | PORTNAME ; written by orchagent, reacted on by ycabled (its replacement)
-  - state: active | standby
- FORWARDING_STATE_COMMAND | PORTNAME:
-  - command: probe | set_active_self | set_standby_self | set_standby_peer ; written by linkmgrd, reacted on by ycabled
- FORWARDING_STATE_RESPONSE | PORTNAME
-  - response: active | standby | unknown | error ; written by ycabled, reacted on by linkmgrd
-  - response_peer: active | standby | unknown | error ; written by ycabled, reacted on by linkmgrd
- PORT_TABLE | PORTNAME
-  - oper_status: up | down ; written by swss, reacted on by linkmgrd
- PORT_TABLE_PEER | PORT
-  - oper_status: up | down ; written by ycabled, reacted on by linkmgrd
- HW_FORWARDING_STATE_PEER | PORTNAME ; written by linkmgrd, reacted on by ycabled
-  - state: active | standby | unknown
- MUX_SERVICE_NOTIFICATION | PORT
-  - notify_type: control/data
-  - msg_type: begin/end
-  - guid:
-  - service_time: