Correct event supported check in nsmEventConfig

1. Corrected the event supported check in nsmEventConfig to check if the configured event is supported by the device.
2. Add GPIO state change event to the event type handlers.

'''
Before fix:
Feb 11 07:11:25 vr-nvl-hmc nsmd[1646]: DEBUG: Event ID 3 for Message Type 0 - CONFIGURED, EID: 13
Feb 11 07:11:25 vr-nvl-hmc nsmd[1646]: DEBUG: Event ID 3 for Message Type 0 - Supported: YES, EID: 13
Feb 11 07:11:25 vr-nvl-hmc nsmd[1646]: Event ID 3 for Message Type 0 is not supported, EID: 13

After fix:
No error logs.
'''

Fixes nvbug https://nvbugspro.nvidia.com/bug/5877475

Signed-off-by: Utkarsh Yadav <uyadav@nvidia.com>
sudo apt install build-essential gcc-13 g++-13 python3-dev nlohmann-json3-dev
pip install --user meson ninja
sudo apt install libboost1.83-all-dev # for Ubuntu 22.04
or
sudo apt install libboost1.84-all-dev # for Ubuntu 24.04
or, if it is not installed, download and install it from source.
wget https://downloads.sourceforge.net/project/boost/boost/1.84.0/boost_1_84_0.tar.gz
tar -xzf boost_1_84_0.tar.gz
cd boost_1_84_0
./bootstrap.sh --prefix=/usr/local
./b2 install
git archive --remote=ssh://git@gitlab-master.nvidia.com:12051/dgx/bmc/libmctp.git develop libmctp-externals.h | tar -x -C common/
# Configure Meson build with debug options and compiler flags (copied from openbmc-build-scripts repo)
meson setup --reconfigure -Db_sanitize=address,undefined -Db_lundef=true -Dwerror=true -Dwarning_level=3 -Db_colorout=never -Ddebug=true -Doptimization=g -Dcpp_args="-DBOOST_USE_VALGRIND -Wno-error=invalid-constexpr -Wno-invalid-constexpr -Werror=uninitialized -Wno-error=maybe-uninitialized -Werror=strict-aliasing" builddir
# Build all targets
ninja -C builddir
# Run all unit tests
meson test -C builddir
# Run specific unit test
meson test -C builddir nsmChassis_test
If you encounter sdbusplus build errors, verify that the revision in subprojects/sdbusplus.wrap matches the version specified in the openbmc-build-scripts repository. Version mismatches can cause build failures.
For other subproject-related errors, you can update all subproject repositories to their latest commits using:
cd subprojects
find -L . -type d -name ".git" | while read gitdir; do
repo=$(dirname "$gitdir")
echo "Pulling updates in $repo"
cd "$repo"
git pull
cd - > /dev/null
done
# Debug all tests
meson test -C builddir --gdb
# Debug specific test
meson test -C builddir nsmChassis_test --gdb
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Debug file with Meson",
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/builddir/${relativeFileDirname}/${fileBasenameNoExtension}",
            "cwd": "${workspaceFolder}/builddir/${relativeFileDirname}",
            "preLaunchTask": "Compile meson test"
        }
    ]
}
{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "Compile meson test",
            "type": "shell",
            "command": "meson compile -C builddir ${fileBasenameNoExtension}",
            "group": "build"
        }
    ]
}
To ensure code consistency and formatting standards in the CI pipeline, clang-format-19 needs to be installed. Follow the steps below to install clang-format-19 on your system:
# Update the package list
sudo apt update
# Install clang-format-19
sudo apt install clang-format-19
This will install clang-format-19 on your system, enabling it for use in the CI pipeline.
To automatically format your code before each commit, create a pre-commit hook with the following steps:
cat > .git/hooks/pre-commit << 'EOL'
#!/bin/sh
# Get list of staged files that are C/C++ source files
files=$(git diff --cached --name-only --diff-filter=ACMR | grep ".*\.[ch]\(pp\)\?$")
if [ -n "$files" ]; then
# Format the files
clang-format-19 -i $files
# Add the formatted files back to staging
git add $files
# Check if any files were modified after formatting
if ! git diff --cached --quiet; then
echo "Formatted C/C++ files were automatically fixed up"
fi
fi
exit 0
EOL
chmod +x .git/hooks/pre-commit
The NSM daemon provides a logging framework with flood prevention capabilities to avoid overwhelming the log system with repetitive messages. This is particularly useful when errors occur repeatedly during polling operations.
The logging framework provides two main mechanisms:
- LG2_LEVEL_FLT macros - PREFERRED - Convenient logging macros with built-in flood prevention
- shouldLog() API - Lower-level API for manual state change tracking (used by LG2_LEVEL_FLT internally)

Important: Always prefer LG2_LEVEL_FLT macros over manual shouldLog() calls unless you have a specific reason to use the lower-level API.
Use LG2_LEVEL_FLT macros when:
CRITICAL: The shouldLog() API is available only in classes that inherit from NsmObject, which inherits from StateChangeLogger:
class StateChangeLogger
{
    // Provides shouldLog() method
};

class NsmObject : virtual public StateChangeLogger
{
    // Inherits shouldLog() method
};

class YourClass : public NsmObject
{
    // Your class has access to shouldLog() via inheritance
};
DO NOT use global shouldLog() functions. Always use the inherited method within your class implementation. This ensures proper state tracking per object instance.
// In a method of a class inheriting from NsmObject
Task<> NsmPort::queryPortStatus()
{
    nsm_sw_codes rc = decode_query_port_status_resp(
        responseMsg.get(), responseLen, &cc, &reasonCode, &dataSize,
        &portState, &portStatus);

    // PREFERRED: Use LG2_ERROR_FLT macro - automatically handles shouldLog()
    // The odd-indexed arguments (values in key-value pairs) are tracked for state changes
    LG2_ERROR_FLT(
        "decode_query_port_status_resp failure | reasonCode: {REASONCODE}, cc: {CC}, rc: {RC}",
        "REASONCODE", reasonCode, // logged and tracked
        "CC", cc,                 // logged and tracked
        "RC", rc                  // logged and tracked
    );

    // ... rest of implementation
}
How it works:
Note: When using LG2_LEVEL_FLT macros, the framework automatically extracts odd-indexed arguments (values in key-value pairs) for state change tracking. Only the following types are supported for state tracking:
- bool
- nsm_reason_codes
- nsm_sw_codes
- nsm_completion_codes

Other types (like int, uint8_t, uint16_t, std::string) can be used in log messages but will NOT be tracked for state changes.
// In a method of NsmEventConfig class
bool NsmEventConfig::validateEventIds()
{
    bool isNotSupported = !isEventIdSupported(eventId);

    // Create a descriptive logger message for state tracking
    const auto loggerMsg = std::format(
        "Validation of Event ID {} for Message Type {}", eventId, messageType);

    // Use LG2_ERROR_FLT - only 'isNotSupported' (bool) is tracked for state changes
    // Note: eventId, messageType, eid are NOT tracked (not supported types)
    LG2_ERROR_FLT(
        "Event ID {ID} for Message Type {MSG_TYPE} is not supported, EID: {EID}",
        "ID", eventId,                  // logged but NOT tracked (int type)
        "MSG_TYPE", messageType,        // logged but NOT tracked (enum/int type)
        "EID", eid,                     // logged but NOT tracked (eid_t type)
        "NOTSUPPORTED", isNotSupported  // logged AND tracked (bool type)
    );
}
Important: Only values of supported types (bool, nsm_reason_codes, nsm_sw_codes, nsm_completion_codes) are tracked for state changes. Other argument values are logged but ignored for flood prevention logic.
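The flood-prevention idea described above can be illustrated with a minimal, self-contained sketch. It is not the real StateChangeLogger: the class name `SketchStateLogger`, its single-`int` state, and the rule for the first observation are assumptions made only for illustration. The sketch reports a message only when the tracked value for a given key changes.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the framework's state tracking; this is NOT the
// real StateChangeLogger API.
class SketchStateLogger
{
  public:
    // Returns true only when `state` differs from the value last seen for
    // `key`, so a repeated identical failure is reported once.
    bool shouldLog(const std::string& key, int state)
    {
        auto [it, inserted] = lastState.try_emplace(key, state);
        if (inserted)
        {
            // First observation (assumption): log only if it is a failure.
            return state != 0;
        }
        if (it->second == state)
        {
            return false; // unchanged state, suppress the repeat
        }
        it->second = state;
        return true; // state changed (new error, or recovery to success)
    }

  private:
    std::map<std::string, int> lastState; // last tracked value per key
};
```

In a polling loop that fails with the same code on every iteration, such a tracker would let the first failure through and suppress the rest until the code changes.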
Only use manual shouldLog() when you need explicit control over when to log:
// In a method of a class inheriting from NsmObject
Task<> NsmPort::queryPortStatus()
{
    nsm_sw_codes rc = decode_query_port_status_resp(
        responseMsg.get(), responseLen, &cc, &reasonCode, &dataSize,
        &portState, &portStatus);

    // Manual check with shouldLog() - only use if you need conditional logging logic
    if (shouldLog("decode_query_port_status_resp", reasonCode, cc, rc))
    {
        LG2_ERROR(
            "decode_query_port_status_resp failure | reasonCode: {REASONCODE}, cc: {CC}, rc: {RC}",
            "REASONCODE", reasonCode, "CC", cc, "RC", rc);
    }
}
Important: shouldLog() is a method inherited from StateChangeLogger via NsmObject. It must be called within a class method, not as a global function.
CRITICAL: When you need to log human-readable translations of error codes using utility functions like utils::nsmSwCodeToString(), you MUST use the manual if(shouldLog()) { LG2_LEVEL() } pattern. You CANNOT use LG2_LEVEL_FLT in this case.
Why? The shouldLog() function only accepts specific types (bool, nsm_reason_codes, nsm_sw_codes, nsm_completion_codes) and does NOT accept std::string or string conversions.
// CORRECT: Manual shouldLog() with LG2_ERROR for human-readable output
Task<> NsmDevice::processResponse()
{
    nsm_sw_codes rc = performOperation();

    // Check state with raw enum values (supported types)
    if (shouldLog("processResponse", rc))
    {
        // Now we can use string conversion utilities in the log message
        LG2_ERROR(
            "processResponse failure | rc: {RC}, readable: {RC_STR}",
            "RC", rc,
            "RC_STR", utils::nsmSwCodeToString(rc) // String translation
        );
    }
}
// WRONG: This will NOT compile - LG2_ERROR_FLT doesn't accept string conversions
Task<> NsmDevice::processResponse()
{
    nsm_sw_codes rc = performOperation();

    // ❌ ERROR: This won't work - utils::nsmSwCodeToString() returns std::string
    LG2_ERROR_FLT(
        "processResponse failure | rc: {RC}, readable: {RC_STR}",
        "RC", rc,
        "RC_STR", utils::nsmSwCodeToString(rc) // ❌ String not supported!
    );
}
Summary: Use if(shouldLog()) { LG2_LEVEL() } when you need string translations or other non-supported types in your log messages.
The framework provides several macro levels:
- LG2_<LEVEL>_FLT - PREFERRED - Logging with flood prevention
  - Calls shouldLog() with odd-indexed arguments
  - Requires a class derived from NsmObject
  - Cannot log string conversions (e.g. utils::nsmSwCodeToString())
- if(shouldLog()) { LG2_<LEVEL>() } - Manual flood prevention with flexibility
  - Supports string translations such as utils::nsmSwCodeToString()
  - Use only when LG2_<LEVEL>_FLT cannot express the message
- LG2_<LEVEL> - Standard logging without flood prevention
  - Does not depend on NsmObject

Available levels: EMERGENCY, ALERT, CRITICAL, ERROR, WARNING, NOTICE, INFO, DEBUG
Recommendation: Always use LG2_<LEVEL>_FLT macros by default. Use the manual if(shouldLog()) pattern only when you need string translations or other non-supported types in your log messages.
When all tracked arguments return to success state, the framework automatically logs a success message:
{FUNCNAME} SUCCESSFUL | Cleared Codes : ReasonCodes=[ERR_TIMEOUT, ERR_NOT_SUPPORTED], ResultCodes=[NSM_SW_ERROR]
This helps you track when issues are resolved without manual success logging.
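The recovery behaviour can be sketched as follows. This is a simplified illustration, not the framework's code: `SketchRecoveryTracker`, its `update` method, and the use of plain integer codes (instead of the named reason and result codes shown in the message above) are all assumptions. The sketch remembers which non-zero codes were seen and emits one "SUCCESSFUL" message listing them when the operation succeeds again.

```cpp
#include <optional>
#include <set>
#include <string>

// Hypothetical tracker; NOT the real StateChangeLogger implementation.
class SketchRecoveryTracker
{
  public:
    // Returns a recovery message when `rc` is 0 and errors were pending.
    // Otherwise records the error (if any) and returns std::nullopt.
    std::optional<std::string> update(const std::string& funcName, int rc)
    {
        if (rc != 0)
        {
            pending.insert(rc);
            return std::nullopt;
        }
        if (pending.empty())
        {
            return std::nullopt; // success with nothing to clear
        }
        std::string msg =
            funcName + " SUCCESSFUL | Cleared Codes : ResultCodes=[";
        bool first = true;
        for (int code : pending)
        {
            if (!first)
            {
                msg += ", ";
            }
            msg += std::to_string(code);
            first = false;
        }
        msg += "]";
        pending.clear();
        return msg;
    }

  private:
    std::set<int> pending; // ordered, so the message is deterministic
};
```

Keeping the pending codes in a set means each distinct code appears once in the recovery message, however many times it occurred.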
- Prefer LG2_<LEVEL>_FLT macros - Always use these by default instead of manual shouldLog() calls
- Use only in NsmObject derived classes - Never use shouldLog() as a global function; it's a method inherited from StateChangeLogger via NsmObject
- When logging string translations (e.g. utils::nsmSwCodeToString()), you MUST use the if(shouldLog()) { LG2_LEVEL() } pattern instead of LG2_LEVEL_FLT
- LG2_LEVEL_FLT calls for the same log message
- shouldLog API) for throttled logs
- Bitfield256 for enum types to track multiple error codes

The NSM daemon tracks various sensor polling operations using progress counters. These counters are stored in a memory-mapped file descriptor (memfd) and can be accessed via D-Bus for dumping, monitoring and debugging purposes.
Each counter type tracks a specific aspect of sensor polling operations:
- Priority: sensorManager.cpp::pollPrioritySensors()
- GPM: nsmGpmOemFactory.cpp when creating GPM sensors
- LongRunning: sensorManager.cpp::updateLongRunningSensor()
- Static: sensorManager.cpp::pollNonPrioritySensors() when pollingType == Static
- RoundRobin: sensorManager.cpp::pollNonPrioritySensors() when pollingType == RoundRobin
- PriorityTimeExceeded: priority polling took longer than SENSOR_POLLING_TIME (typically 150ms); sensorManager.cpp::pollPrioritySensors() when (t1 - t0) > pollingTimeInUsec
- PostPatch: nsmDevice.cpp::postPatchIO()
- Event: nsmEvent.cpp::DelegatingEventHandler::delegate()
- Error: return code other than NSM_SUCCESS or NSM_SW_ERROR_TIMEOUT; progressCounters.cpp::increment() when rc != NSM_SUCCESS and rc != NSM_SW_ERROR_TIMEOUT
- Timeout: timeout return code (NSM_SW_ERROR_TIMEOUT); progressCounters.cpp::increment() when rc == NSM_SW_ERROR_TIMEOUT

Progress counters can be configured via meson options:
- progressCounter: Enable/disable progress counter functionality (default: enabled)
- sensor-progress-counters-dump-count-threshold: Number of counter updates before dumping to memfd (default: 100000)
- sensor-progress-counters-dump-time-threshold: Time threshold in microseconds before dumping (default: 600000000 = 10 minutes)
- sensor-progress-counters-memfd-size: Size of the memory-mapped file in bytes (default: 65536)

Counter data is exposed via D-Bus at:
/xyz/openbmc_project/progress_counters/<device_eid>
Use the nsmProgressCountersReader tool to read counter data:
# Read counters for all devices
nsmProgressCountersReader
# Read counters for specific device
nsmProgressCountersReader <device_eid>
To add a new progress counter type, follow these steps:
Add your new counter type to nsmd/nsmProgressCounters/progressCounterType.hpp:
enum class ProgressCounterType
{
    Priority,
    GpuPerformanceMonitoring,
    // ... existing counters ...
    YourNewCounter, // Add here, before EnumCount (must be last)
    EnumCount,
};
Important: Always add new counters before EnumCount, as EnumCount must remain the last entry for the PollingCountersSize calculation.
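Why EnumCount has to stay last can be seen in a small sketch. The names `SketchCounterType`, `sketchCountersSize`, `SketchCountersArray`, and `indexOf` are hypothetical, and how the real PollingCountersSize is computed is an assumption here; the sketch only shows the common idiom of deriving an array size from a trailing enumerator.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Mirrors the enum layout described above; `YourNewCounter` is a placeholder.
enum class SketchCounterType
{
    Priority,
    GpuPerformanceMonitoring,
    YourNewCounter,
    EnumCount, // must stay last
};

// Because EnumCount is last, its underlying value equals the number of
// counters declared before it.
constexpr std::size_t sketchCountersSize =
    static_cast<std::size_t>(SketchCounterType::EnumCount);

using SketchCountersArray = std::array<std::uint64_t, sketchCountersSize>;

// Convert a counter type to its array index.
constexpr std::size_t indexOf(SketchCounterType type)
{
    return static_cast<std::size_t>(type);
}
```

If a counter were added after EnumCount, the derived size would be one too small and the new counter's index would fall outside the array.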
Add comprehensive documentation for your new counter in the nsmd/nsmProgressCounters/progressCounterType.hpp file:
/**
 * @brief Your new counter description
 *
 * Incremented when: Describe when this counter is incremented
 *
 * Location: File.cpp::functionName()
 */
YourNewCounter,
Add your new counter to the “Counter Types and When They Are Incremented” section in this README with:
Add your counter name to the return vector in nsmd/nsmProgressCounters/progressCounters.cpp:
ProgressCounters::ProgressCounters(eid_t eid) :
    PollingCountersBase(
        SENSOR_PROGRESS_COUNTERS_DUMP_COUNT_THRESHOLD,
        SENSOR_PROGRESS_COUNTERS_DUMP_TIME_THRESHOLD,
        progressCountersObjectBasePath / "polling" / std::to_string(eid),
        "Polling Progress Counters for device EID=" + std::to_string(eid),
        {
            "Priority",
            "GPM",
            "LongRunning",
            "Static",
            "RoundRobin",
            "PriorityTimeExceeded",
            "PostPatch",
            "Event",
            "Error",
            "Timeout",
            "YourNewCounter", // Add your counter name here
        })
{}
Important: The order must match the enum order in ProgressCounterType. This vector is exposed via D-Bus as the CountersHeaders property and is used by nsmProgressCountersReader to display counter names in CSV output.
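One way to guard against a header list that drifts out of step with the enum is a compile-time size check. The sketch below is an illustration under assumptions: `SketchType`, `sketchHeaders`, and `headerFor` are hypothetical names, and the real constructor takes a runtime list rather than a constexpr array. A size check catches a missing or extra name, although it cannot detect two names that were swapped.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical, shortened enum for illustration only.
enum class SketchType
{
    Priority,
    Gpm,
    YourNewCounter,
    EnumCount, // must stay last
};

// Header names listed in the same order as the enumerators above.
constexpr std::array<std::string_view, 3> sketchHeaders = {
    "Priority", "GPM", "YourNewCounter"};

// Fails to compile if a counter is added without a matching header.
static_assert(sketchHeaders.size() ==
                  static_cast<std::size_t>(SketchType::EnumCount),
              "header count must match the number of counter types");

// Look up the display name for a counter type by its enum position.
constexpr std::string_view headerFor(SketchType type)
{
    return sketchHeaders[static_cast<std::size_t>(type)];
}
```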
In the appropriate location in your code, increment the counter:
// For successful operations
nsmDevice->progressCounters().increment(ProgressCounterType::YourNewCounter, rc);
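How a return code might feed the Error and Timeout counters can be sketched as below. This is an assumption-laden illustration: the `Sketch*` names and the placeholder return codes are hypothetical, and the sketch assumes the named counter is bumped on every call, which the real progressCounters.cpp::increment() may or may not do. Only the Error and Timeout conditions follow the counter descriptions in this README.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Placeholder return codes for this sketch; the real values come from the
// NSM headers and are not reproduced here.
enum SketchRc
{
    SKETCH_SUCCESS,
    SKETCH_ERROR_GENERIC,
    SKETCH_ERROR_TIMEOUT,
};

enum class SketchCounter
{
    Priority,
    Error,
    Timeout,
    EnumCount, // must stay last
};

class SketchProgressCounters
{
  public:
    // Assumption: the named counter is bumped on every call, and the
    // return code additionally selects the Error or Timeout counter.
    void increment(SketchCounter type, SketchRc rc)
    {
        bump(type);
        if (rc == SKETCH_ERROR_TIMEOUT)
        {
            bump(SketchCounter::Timeout);
        }
        else if (rc != SKETCH_SUCCESS)
        {
            bump(SketchCounter::Error);
        }
    }

    std::uint64_t value(SketchCounter type) const
    {
        return counters[static_cast<std::size_t>(type)];
    }

  private:
    void bump(SketchCounter type)
    {
        ++counters[static_cast<std::size_t>(type)];
    }

    std::array<std::uint64_t,
               static_cast<std::size_t>(SketchCounter::EnumCount)>
        counters{};
};
```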
Counters are stored in a packed structure for efficient memory usage:
template <std::size_t Size>
struct __attribute__((packed)) CountersDataRow
{
    uint32_t key;                 // Iteration/dump key
    uint64_t timestamp;           // Timestamp in microseconds
    CountersArray<Size> counters; // Array of counter values
};
The data rotates in the memfd using key % maxRows to ensure bounded memory usage.
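The rotation rule can be illustrated with a small in-memory sketch. It is not the daemon's memfd code: `SketchRow` and `SketchRing` are hypothetical, the row is simplified to a single value and is not packed, and a `std::vector` stands in for the memory-mapped file. The sketch assumes `maxRows` is greater than zero.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Simplified row; the real CountersDataRow is packed and holds an array of
// counter values.
struct SketchRow
{
    std::uint32_t key = 0;
    std::uint64_t timestamp = 0;
    std::uint64_t value = 0;
};

// Hypothetical fixed-capacity store that reuses slots via key % maxRows.
class SketchRing
{
  public:
    explicit SketchRing(std::size_t maxRows) : rows(maxRows) {}

    // Writes a row into slot (key % maxRows), overwriting the oldest entry
    // once all slots have been used. Returns the slot that was written.
    std::size_t dump(std::uint64_t timestamp, std::uint64_t value)
    {
        const std::size_t slot = nextKey % rows.size();
        rows[slot] = SketchRow{nextKey, timestamp, value};
        ++nextKey;
        return slot;
    }

    const SketchRow& at(std::size_t slot) const
    {
        return rows.at(slot);
    }

  private:
    std::vector<SketchRow> rows;
    std::uint32_t nextKey = 0; // monotonically increasing dump key
};
```

Because the key keeps increasing while the slot wraps around, a reader can tell the newest row from the oldest by comparing keys.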
The NSM daemon tracks device discovery operations using discovery event counters. These counters are stored in a memory-mapped file descriptor (memfd) and can be accessed via D-Bus for monitoring and debugging the device discovery process.
Discovery event counters track the state of device discovery operations. Unlike polling counters which increment continuously, discovery event counters track the state or result of specific discovery operations. Each counter is initialized to -1 (not executed/not triggered) and is updated as the discovery process progresses.
The following discovery events are maintained for each NSM device:
- -1: Not triggered; 0+: Count of interface added signals received
- -1: Not triggered; 0+: Count of interface removed signals received
- -1: Not set; 0: Not available; 1: Available
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not attempted; 0: Failed; 1: Success
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not executed; RC: NSM return code (0 = success, non-zero = error code)
- -1: Not attempted; 0: Failed; 1: Success

Discovery event counters share the progress counter configuration:
- progressCounter: Enable/disable progress counter functionality (default: enabled)
- discovery-progress-counters-memfd-size: Size of the memory-mapped file in bytes (default: 8192)

Discovery event counters are automatically dumped when any counter value changes after being set, ensuring each discovery operation snapshot is captured.
Discovery event data is exposed via D-Bus at:
/xyz/openbmc_project/progress_counters/discovery/<device_eid>
Use the nsmProgressCountersReader tool to read discovery event data:
# Read discovery events for all devices
nsmProgressCountersReader
# Read discovery events for specific device
nsmProgressCountersReader <device_eid>
Discovery event counters use signed 8-bit integers (int8_t) to represent three distinct states:
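- Not executed (-1): The operation has not been performed yet
- Failed (0 or error code): The operation failed or returned an error
- Succeeded (1 or 0 for success): The operation completed successfully

Whether 0 means failure or success depends on the counter: flag-style counters use 0 for failure and 1 for success, while return-code counters use 0 for success and a non-zero code for an error.

This three-state model allows distinguishing between operations that haven't run yet versus operations that ran but failed.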
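Decoding the two value conventions listed for the discovery events can be sketched as follows. The function names `describeRcCounter` and `describeFlagCounter` are hypothetical helpers, not part of nsmProgressCountersReader; the value meanings are taken from the event list in this README.

```cpp
#include <cstdint>
#include <string>

// Return-code style counter: -1 = not executed, 0 = success,
// any other value = NSM error code.
inline std::string describeRcCounter(std::int8_t value)
{
    if (value == -1)
    {
        return "not executed";
    }
    if (value == 0)
    {
        return "success";
    }
    return "error code " + std::to_string(static_cast<int>(value));
}

// Flag style counter: -1 = not attempted, 0 = failed, 1 = success.
inline std::string describeFlagCounter(std::int8_t value)
{
    switch (value)
    {
        case -1:
            return "not attempted";
        case 0:
            return "failed";
        case 1:
            return "success";
        default:
            return "unknown"; // outside the documented range
    }
}
```

Keeping the two decoders separate avoids misreading a 0, which means success for a return-code counter but failure for a flag counter.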
For NSM_Temp sensors, SensorId = 0 is considered the primary temperature sensor.
As per NSM Spec (Get Temperature Reading command):
SensorId 0 = Device temperature averaged across all device sensors
A successful build should generate three binary artifacts.
nsmd is a daemon that discovers NSM endpoints, gathers telemetry data from them, and publishes it to D-Bus or similar IPC services for consumer services such as bmcweb.
nsmtool is a client tool that acts as an NSM requester and can be invoked from the BMC. It sends the request message, parses the response message, and displays it in a readable format.
nsmMockupResponder is a mockup NSM responder that can be used for development purposes. Its primary use is to test nsmd and nsmtool features on an emulator like QEMU.
Follow these steps to run nsmMockupResponder:
Step 1 - On the QEMU instance, restart the nsmd service.
Step 2 - Assign an address to the loopback (lo) interface:
$ mctp addr add 12 dev lo
Step 3 - Immediately start the mock responder using the assigned address:
$ nsmMockupResponder -v -d Baseboard -i 0 -e 12
Run Step 3 right after Step 2. If there is any delay, nsmd will fail to detect the endpoint. If detection fails, repeat all steps from the beginning.