The bulk of this change set was automated by the following script, which
is being used to help convert the various exporters/projects to use
slog:
https://gist.github.com/tjhop/49f96fb7ebbe55b12deee0b0312d8434
Other changes include:
- bump prometheus/{common,client_golang,exporter-toolkit}
- bump minimum go version to go1.22
- remove old go-kit/log linter configs, add sloglint
Signed-off-by: TJ Hoplock <t.hoplock@gmail.com>
The NVMe specification says that the controller is responsible for
reporting "Data Units Read" & "Data Units Written" converted as needed
for logical block sizes other than 512 bytes. smartmontools already has
the correct behavior.
What is correct in this case? For now, track what smartmontools does:
take the counter, multiply by 512*1000, report the value.
We should be clear that it means the drive has read/written at most
that many bytes.
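The conversion described above can be sketched as follows. This is only an illustration: the function name `dataUnitsToBytes` is hypothetical and not taken from the exporter, and the result should be read as an upper bound on the bytes transferred.

```go
package main

import "fmt"

// dataUnitsToBytes is a hypothetical helper, not the exporter's actual code.
// It converts an NVMe "Data Units Read"/"Data Units Written" counter to
// bytes, following what smartmontools does: each data unit stands for
// 1000 blocks of 512 bytes, regardless of the logical block size.
// The result is an upper bound on the bytes actually read or written.
func dataUnitsToBytes(units uint64) uint64 {
	return units * 512 * 1000
}

func main() {
	fmt.Println(dataUnitsToBytes(1))     // 512000
	fmt.Println(dataUnitsToBytes(12345)) // 6320640000
}
```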
This has a few impacts:
- NVMe devices will now show these metrics, if they did not before.
- NVMe devices with a block size other than 512 bytes may have previously
reported inflated metrics, but are now corrected (is this worthy of
larger notice in changelogs?)
Reference: 11415ee0b9/smartmontools/nvmeprint.cpp (L394-L397)
Closes: https://github.com/prometheus-community/smartctl_exporter/issues/122
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
Fix the following metrics that were exported as zero because the
exporter did not know how to read them for SCSI devices:
- smartctl_device_bytes_read
- smartctl_device_bytes_written
- smartctl_device_power_cycle_count
New metrics:
- smartctl_read_errors_corrected_by_eccdelayed
- smartctl_read_errors_corrected_by_eccfast
- smartctl_write_errors_corrected_by_eccdelayed
- smartctl_write_errors_corrected_by_eccfast
Fix labels:
- smartctl_device{model_name} is now populated for SCSI/SAS, using
scsi_model_name.
New labels:
- smartctl_device{} gains:
scsi_product,scsi_revision,scsi_vendor,scsi_version
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
The exporter presently has metrics that are nonsense for a given type of
drive, and remain at zero due to their defaults.
Change the behavior to NOT emit a metric if the underlying JSON field is
not present.
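A minimal sketch of the "skip when absent" idea, using only the standard library. The helper `fieldValue` and the sample JSON documents are assumptions made for illustration; the exporter's real parsing code may look quite different.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fieldValue is a hypothetical helper. It returns the numeric value stored
// under a top-level key and reports ok=false when the key is absent, null,
// or not a number, so the caller can skip the metric instead of emitting 0.
func fieldValue(raw []byte, key string) (value float64, ok bool) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(raw, &doc); err != nil {
		return 0, false
	}
	msg, present := doc[key]
	if !present || string(msg) == "null" {
		return 0, false
	}
	if err := json.Unmarshal(msg, &value); err != nil {
		return 0, false
	}
	return value, true
}

func main() {
	// Made-up sample documents, not real smartctl output.
	samples := [][]byte{
		[]byte(`{"device": {"type": "sat"}}`),
		[]byte(`{"device": {"type": "nvme"}, "nvme_total_capacity": 3840755982336}`),
	}
	for _, raw := range samples {
		if v, ok := fieldValue(raw, "nvme_total_capacity"); ok {
			fmt.Printf("emit metric: %.0f\n", v)
		} else {
			fmt.Println("field absent: no metric emitted")
		}
	}
}
```

Returning an explicit `ok` flag keeps "value is zero" distinct from "field not reported", which is the distinction this change relies on.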
Future related work may include parsing the corresponding metrics for
SATA/SAS SSDs (e.g. `smartctl_device_percentage_used` could be derived from
`SSD_Life_Left` on some drives).
Metrics no longer exported for the wrong type of drive:
- `smartctl_device_nvme_capacity_bytes` (NVME-specific)
- `smartctl_device_available_spare` (NVME-specific, ATA possible)
- `smartctl_device_available_spare_threshold` (NVME-specific, ATA
possible)
- `smartctl_device_critical_warning` (NVME-specific, ATA possible)
- `smartctl_device_interface_speed` (ATA-specific)
- `smartctl_device_media_errors` (NVME-specific, ATA possible)
- `smartctl_device_num_err_log_entries` (NVME-specific, SCSI uses
distinct metrics, ATA possible)
- `smartctl_device_percentage_used` (NVME-specific, ATA possible)
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
* remove redundant meta labels from SCSI metrics
* added `smartctl_device_nvme_capacity_bytes` metric
* for some devices, such as 2.5" NVMe Intel & Micron the `family` field may be empty
The `.user_capacity` field exists only when the NVMe device has a single
namespace. For NVMe devices with multiple namespaces, when the device name is
used without a namespace number (as the exporter does), `.user_capacity` is absent:
```
smartctl --info --health --attributes \
--tolerance=verypermissive --nocheck=standby --format=brief --log=error \
/dev/nvme11 --json | jq '.user_capacity'
null
smartctl --info --health --attributes \
--tolerance=verypermissive --nocheck=standby --format=brief --log=error \
/dev/nvme11 --json | jq '.nvme_total_capacity'
3840755982336
```
Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>
This value is reported in thousands (i.e., a value of 1 corresponds to 1000 units of 512 bytes written) and is rounded up.
When the LBA size is a value other than 512 bytes, the controller shall convert the amount of data written to 512 byte units.
The current code uses 1024 instead of 1000.
Signed-off-by: tekert <tekert@gmail.com>
Field engineers need to know the form factor of the device, e.g. 3.5" or 2.5".
* updated EXAMPLE.md
* fixed copy-paste issue `Starting systemd_exporter`
Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>
Switch exporter over to standard Prometheus exporter flags and logging.
This eliminates the need for a configuration file.
Signed-off-by: SuperQ <superq@gmail.com>