5G FaultManagement_Process by L2 and Wow Training.pptx
AmanMomin9
24 slides
Sep 13, 2024
About This Presentation
Explains the 5G Fault Management Process.
Size: 13.06 MB
Language: en
Slide Content
<Document ID: change ID in footer or remove> <Change information classification in footer> 5G L2 Fault Management Process and WoW Training – BLR L2 Team Vijayakumar Shezhiyan 28/01/2021
Agenda & Topics:
1) 5G SW faults: WoW and process know-how.
2) Pronto DoD know-how.
3) Bookmark links for working on Pronto faults.
Nokia BTS Fault Management Process -> Dev, ET PoV
The Fault Management process defines how product defects are managed for 2G, 3G, 4G and 5G products.
Fault Management relies on the Pronto tool for managing faults found in Nokia products.
The Pronto tool is used for both R&D internal faults and faults raised by customers, and can also handle Testing Tools and Documentation type faults.
Overview of Fault Management and its interfaces
Pronto Access – Production server & QA server
- Login to Pronto 4 uses the Nokia intranet username and password. Single Sign-On (SSO) is implemented.
- https://pronto.int.net.nokia.com/pronto/ -> Live Production
General rules – Rules to know when dealing with Prontos
These rules should be familiar to all tribe members.
- It is not good practice to create one big Pronto that describes multiple problems which might not be linked to each other.
- If the problem description is not complete or the mandatory logs are missing, the problem report must be transferred back to the author for update and the [#BAD_ANALYSIS] tag must be used.
- Before creating a new Problem Report, check that no other open Pronto already reports a similar issue; refer to How to avoid pronto duplicates.
- The subject of a Pronto must not be changed after it is reported.
- The Pronto description should stay relevant to the original symptoms, not to new or different symptoms seen during later problem re-tests or test case re-runs.
General rules – Rules to know when dealing with Prontos
- A new Fault Report (PR – Problem Report) can be raised against a build that is not older than 2 regular (Monday - Friday) working days, counted from the QT release date, when later builds have already been released (or released with restrictions).
- Exceptions to the above rule include Performance, Stability, Field Verification and Customer Problem Reports, among others.
- If the above rule is not followed, the Pronto can be transferred back to the author's group immediately with a request to reproduce the issue on a valid R&D build.
- A Pronto for an official release can NOT be reported for:
a) An issue that is not described as a requirement in official documentation.
b) A build that is not officially released (only the QT Team can create a PR for a build released for QT).
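The build-age rule above can be sketched as a small date check. This is only an illustration: the function names, the `exempt` flag for the listed exception types, and the choice to count working days from the QT release date up to the reporting day (ignoring public holidays) are assumptions, not part of the Pronto tool.

```python
from datetime import date, timedelta

MAX_BUILD_AGE_WORKING_DAYS = 2  # limit taken from the rule on the slide


def working_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0..4 are Monday..Friday
            days += 1
    return days


def build_is_valid_for_new_pr(qt_release: date, report_day: date,
                              later_build_released: bool,
                              exempt: bool = False) -> bool:
    """Hypothetical check of the build-age rule.

    `exempt` stands for the exception types named on the slide
    (Performance, Stability, Field Verification, Customer PRs).
    """
    if exempt or not later_build_released:
        return True
    age = working_days_between(qt_release, report_day)
    return age <= MAX_BUILD_AGE_WORKING_DAYS


# A build released for QT on Friday 22/01/2021, reported the next Tuesday
print(build_is_valid_for_new_pr(date(2021, 1, 22), date(2021, 1, 26), True))
```

Weekend days are skipped, so a build released on a Friday is still two working days old on the following Tuesday.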
Pronto Creator Interface - Harmonized FM Tool for Pronto Creation
Reporting Portal (nsn-rdnet.net)
Goals: to improve Pronto handling efficiency, to start using Machine Learning algorithms, and to improve the Pronto user experience.
PCI must be used before each Pronto Report creation in the Pronto Tool, starting from 15.06.2020.
PCI -> front-end based on the Reporting Portal (ReP) engine.
- Golden Standard Validator (GSV) validates whether the Description field follows the rules of the Harmonized Fault Management process.
- SIMSBot checks whether the same Pronto already exists in the Pronto Tool, to minimize the risk of duplicated Pronto Reports.
- MLoGIC supports the first Pronto Group assignment (a proper Group in Charge), to minimize the risk of incorrect Pronto transfers between Author and SW Teams.
Create new Problem Report (PR) - Person who discovers the problem (I&V)
Click the Report New Problem link and fill in the necessary fields.
Mandatory fields for BTS are underlined in this guide.
Mandatory fields in the Pronto tool are marked by *.
----- Identification -----
Title: Short explanation of the fault.
Software/Hardware: The Software or Hardware Release/Build that was in use when the fault was found. Use the “>” button to search.
Release: Filled in automatically when the Build is selected.
Product: Filled in automatically when the Build is selected. If the Release and Build relate to several products, the Product must be chosen from the list of Product values.
Delivery Package: Not used for internal PRs in BTS Products.
Feature: The feature that was being tested when the fault was found. The Feature can be searched and selected after the technical information (Product, Release and Build) is filled in. Notice: If the technical information (Product, Release and Build) is edited at any time before or after Problem Report creation, the Feature field value is validated again and may need to be selected again.
Description: Extensive explanation of the fault. Notice: If a Product or Release specific customized Description template is activated, it appears automatically in the Description field for the user to fill in.
Attachments: If needed or available (maximum file size 100MB).
Reported By: Collaborator, Customer, Nokia (default), Vendor.
Author: Author Name and Author Group.
Group in Charge: The group with the best knowledge of the issue. Select the responsible group carefully. If the responsible group isn’t clear, consult the CT FC of a group and select the group accordingly.
Development Fault Coordinator: The name of the Fault Coordinator is filled in automatically when the document is saved. The name of the Development Fault Coordinator is defined in the Pronto Group parameter.
Other Interested: Person(s) interested in PR follow-up (optional). Users added to the Other Interested field get Pronto mails when the case proceeds, for example on state changes.
continues …
Create new Problem Report (PR) - Person who discovers the problem (I&V)
----- Specification -----
Problem Type: Software/Hardware/Documentation/Testware/Change Request; select a value from the list. Usage of the value Change Request is explained in more detail in Change Note vs. Problem Report.
Security Fault: This field is used by Product Security Organizations.
Severity: A, B or C (see Problem Severity Classification).
R&D Priority: Selectable values in a list. See the Usage of R&D Priority section for further information on how this field is used.
Top Importance: Priority of the Release for the correction. In Customer Problem Reports the information is transferred from Electra. For internal Problem Reports the field is used by FMM. See the FCB and FMM minutes or the MN FM Process for detailed instructions and the latest “TOP Labels”.
Discovered in: Choose the test level where the problem was found.
Test Subphase: Test Subphase values are explained in the MN FM Process. A link is visible on the Further Information page.
Number of Fault Occurrence / Affected live sites:
- Internal Reports: How many times was the problem reproduced?
- Customer Reports: How many sites in live operations were affected by the problem?
- Documentation type report: How many times is the reported issue defined improperly in the documentation?
Total number of Test execution / Identical sites (the same SW, HW and configuration):
- Internal Reports: How many times was the Test Scenario run?
- Customer Reports: How many sites in live operations are running with the same SW, HW and configuration?
- Documentation type report: Default value is 1 (means the documentation has been checked once).
Repeatability: A dedicated field for a customer fault report from the CS (Case Handling) Tool. It states how often the problem appeared. Values: Occasional, One Occurrence, Permanent.
Customer Impact Analysis: The author’s technical evaluation, as an initial customer impact analysis of the fault from a system level point of view. Select a proper value.
Fault Summary for Customer Communication: The author’s detailed description of the problem for customer communication. The analysis has to contain:
1. Impact on an operator
2. Impact on an end user
3. Impacted HW System Module, e.g. ‘FSMF.101’
4. Impacted HW Radio Module, e.g. ‘FRHA.101’
Recovery Action: The author’s technical evaluation of how the system will recover from the error situation.
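A few of the Specification fields above have a fixed set of values, which can be modelled as enumerations. The sketch below is only an illustration: the class and field names are hypothetical, and reading the two count fields together as an occurrence ratio is an interpretation of the slide, not something the Pronto tool is known to compute. The ratio is meaningful for internal and customer reports only, since a Documentation type report uses a default total of 1.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    A = "A"
    B = "B"
    C = "C"


class Repeatability(Enum):
    OCCASIONAL = "Occasional"
    ONE_OCCURRENCE = "One Occurrence"
    PERMANENT = "Permanent"


@dataclass
class Specification:
    """Hypothetical container for a few Specification fields."""
    severity: Severity
    repeatability: Repeatability
    fault_occurrences: int      # or affected live sites
    total_executions: int = 1   # or identical sites

    def __post_init__(self):
        if self.fault_occurrences < 0:
            raise ValueError("fault_occurrences must not be negative")
        if self.total_executions < 1:
            raise ValueError("total_executions must be at least 1")

    def occurrence_ratio(self) -> float:
        """Share of runs (or sites) in which the fault was seen."""
        return self.fault_occurrences / self.total_executions


# Enum lookup by value rejects anything outside the listed values
spec = Specification(Severity("A"), Repeatability("Permanent"), 3, 4)
print(spec.occurrence_ratio())
```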
Harmonized Pronto Title
The purpose of this instruction is to unify the tags used in Pronto naming, with the potential benefit of speeding up searches for similar Prontos and lowering the attach rate. It also has big potential for working together with machine learning tools like PCI.
Tags indicating the phase in which the fault was found - examples: [PIT], [CIT], [STABI], [PET], [Feature ID], [KPI], [ST], [FiVe], [IODT], [L3_CAMP], etc.
Tag indicating system module platform - 5G AiC, 5G Radio, AIRSCALE, FSMF, FSIH, FZM, etc.
Tag indicating huge features - [FHS], [MMIMO], [ULCOMP], [ENDC]
Dedicated tag for suspected hardware (system module or RF module) as source of failure - [ABIC], [AAFIA], [ANEGA]
Harmonized Pronto Title
Dedicated tag if an emulator is suspected - [IPHY], [RTG], [AP], [L1BP], [ENBSIM], [GNBSIM], [EPCSIM], [APSIM], [EMSSIM], [VIAVI], [KEYSIGHT]
Free Text Area - Main execution scenario + Alarm description (if exists)
Main execution scenario: e.g. Startup, Commissioning, Reset, Block, Unblock, Configuration Reset.
Alarm description (if exists): e.g. Autonomous reset, Failure in optical interface, SW Fallback, RF Module Failure.
Good examples of Pronto subjects including tags + free text area:
[QT][FDD][AIRSCALE+AIRSCALE][AHFB][FID:1868] Not enough HW for LCR after site reset
[PET][TDD][AIRSCALE][ABIA][FID:6709] Failure in replaceable baseband unit after commissioning
[FDD][FSMF][FXED][FID:6253] Cell configuration data distribution failed after BTS unblock
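The title convention above (a run of bracketed tags followed by free text) lends itself to simple parsing, for example when searching for similar Prontos. The following is a minimal sketch; the function names are hypothetical, and treating a tag that starts with `FID:` as the feature ID is an assumption based on the example subjects.

```python
import re

# One bracketed tag such as [PET] or [FID:6709]
TAG_RE = re.compile(r"\[([^\[\]]+)\]")


def split_title(title: str):
    """Return (tags, free_text), where tags are the leading [...] groups."""
    tags = []
    pos = 0
    while True:
        match = TAG_RE.match(title, pos)
        if match is None:
            break
        tags.append(match.group(1).strip())
        pos = match.end()
        # Allow optional whitespace between consecutive tags
        while pos < len(title) and title[pos].isspace():
            pos += 1
    return tags, title[pos:].strip()


def feature_id(tags):
    """Return the value of the first FID:<n> tag, or None if absent."""
    for tag in tags:
        if tag.upper().startswith("FID:"):
            return tag.split(":", 1)[1].strip()
    return None


tags, text = split_title(
    "[PET][TDD][AIRSCALE][ABIA][FID:6709] "
    "Failure in replaceable baseband unit after commissioning"
)
print(tags, feature_id(tags), text)
```

Only leading tags are collected, so brackets that appear later in the free text area are left untouched.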
PR Golden Standard Template – PR Description Field
Rule 1: Detailed Test Steps
Rule 2: Expected Results
Rule 3: Actual Results
Rule 4: Tester Analysis
Rule 5: Attached Logs Content
Rule 6: Test-Line Reference/used HW/configuration/tools/SW version
Rule 7: Used Flags (list here used R&D flags)
Rule 8: Test Scenario History of Execution
Rule 9: Testcase (QC, RP link or UTE link)
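A pre-check of a draft description against the nine rules above could look like the sketch below. It is not the Golden Standard Validator itself, whose implementation is not described here; the function name and the idea of matching section names as case-insensitive substrings are assumptions made for illustration.

```python
from typing import List

# Section names taken from Rules 1-9 of the template
GOLDEN_SECTIONS = [
    "Detailed Test Steps",
    "Expected Results",
    "Actual Results",
    "Tester Analysis",
    "Attached Logs Content",
    "Test-Line Reference",
    "Used Flags",
    "Test Scenario History of Execution",
    "Testcase",
]


def missing_sections(description: str) -> List[str]:
    """Return the template sections not mentioned in the description."""
    lowered = description.lower()
    return [name for name in GOLDEN_SECTIONS if name.lower() not in lowered]


draft = """Detailed Test Steps: 1. Commission the site 2. Reset the BTS
Expected Results: cells on air
Actual Results: cells stay disabled
"""
print(missing_sections(draft))
```

A check like this only confirms that each section is present; it says nothing about whether the content of a section is adequate.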
Severity and R&D Priority
Prontos during FOT feature development – PRs WoW during FOT development life cycle
- If the Feature is DONE, a Pronto Report is raised normally according to the FM process.
- PRs can be raised while a feature is still being developed under FOT, provided these conditions are met:
a) A Pronto Report shall be raised if an Entity Item or System Test Item planned for the current FB cannot be set to “Done” (based on MN DoD). It is expected that at least one Pronto is assigned to the failing TCs of this Item.
b) Inside FOT/FT: correction of a problem is expected from the team/SC contributing to the FOT/FT. If the correction has not been delivered after 3 working days at the latest, the FOT Leader always has to decide whether creating a Fault Report makes sense and speeds up investigation of a feature completion blocker.
How to write better Fault Analysis
Different Parts of a Fault Analysis
Technical Analysis
- Identification
- Impact to End-User
- Impact to Operator
- Resolution
- Feature
- Etc.
Quality Analysis
- Should Have Been Found In
- Root Cause
- Defective Software Fix
- Name of DFc
Technical Analysis
Identification
- Short and understandable summary of the problem
- When and why does the problem occur?
- The PR title alone is NOT enough! The Pronto title can be used as a reference.
- What was the customer or tester trying to do?
- What was the problem from the customer's or tester’s viewpoint? What was wrong or not working?
- The fault needs to be described from the operator's point of view, NOT ONLY for internal technical experts.
- Platform info in which the problem exists (Flexi Rel1/2/3/4, Ultra)
Note: State the observed phenomena, not the root cause.
Technical Analysis
Identification
- How could the end user/operator detect the problem?
- How could the issue be reproduced and identified by the end user/operator?
Description of the fault
- What has caused the fault?
- Any additional relevant facts to note?
- Dependency on configuration
- Faulty component and version
Faulty component first delivered in (e.g. release, CD) (mandatory)
- When was the issue first introduced into the system? (eNB build number)
- Which feature, CR or Pronto introduced this issue? (Feature ID, CR ID, Pronto ID or Legacy Code)
Technical Analysis
Resolution
Workaround
- Is there a workaround available for the customer?
- If a workaround exists, describe the method to avoid the problem or minimize the defect impact before the actual correction is available. If not, use “No Workaround”.
Description of the correction (incl. risk analysis)
- What was the solution and change?
Technical Analysis
Resolution
- Analysis of the correction risk to the customer / end-user
- Any impact and risk of the code change for the interfaces or the customer
Correction effects
Interface effects
- Whether the correction impacts an external interface of the product or standard compliancy
- Short and understandable explanation of the correction impact, e.g.: What interfaces are impacted? How are the interfaces impacted? What is the interface change compared to the previous implementation?
Fault Correction Funnel – With COOP
- We must control the quality of our SW promotions to Trunk automatically.
- For new Feature development we have the “Feature DoD score”, but 20-30% of SW promotions to Trunk are related to fault corrections, and we do not have similar control or visibility in place for those.
Fault Correction Funnel (FCF) (nokia.com) – Continuous Delivery Visualization
Bookmark links
PR Tool Production Server –
PR Tool Server –
Harmonized Fault Management Process -
PCI web link -
WFT - https://wft.int.net.nokia.com/
Reporting Portal :: Reports - QC Test instances (nsn-rdnet.net)
Fault Correction Funnel (FCF) (nokia.com)