- Add README about test data.
- Add script to redact sensitive fields.
- Add JSON testing data collected from many systems, with redaction of
sensitive fields.
The initial corpus includes:
- NVME drives
- SAS drives - HDD only, no SSD
- SCSI drives - HDD only, no SSD
- SATA drives - SSD & HDD
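
For illustration only, a redaction pass over smartctl JSON could look roughly like the sketch below. The key list (`serial_number`, `wwn`, `logical_unit_id`), the `REDACTED` placeholder, and the `redact` helper are assumptions made for this example; the script added in this change may use a different list and approach.

```python
import json

# Assumed set of sensitive keys; the real script's list may differ.
SENSITIVE_KEYS = {"serial_number", "wwn", "logical_unit_id"}


def redact(node, placeholder="REDACTED"):
    """Return a copy of node with the values of sensitive keys replaced."""
    if isinstance(node, dict):
        return {
            key: placeholder if key in SENSITIVE_KEYS else redact(value, placeholder)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [redact(item, placeholder) for item in node]
    # Scalars (strings, numbers, booleans, null) are returned unchanged.
    return node


# Hypothetical sample input, not taken from the real corpus.
sample = json.loads(
    '{"model_name": "ExampleDisk", "serial_number": "S123", "device": {"name": "/dev/sda"}}'
)
redacted = redact(sample)
print(json.dumps(redacted, indent=2))
```

The function builds a new structure instead of mutating its input, so the original data stays available for comparison.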
Signed-off-by: Robin H. Johnson <robbat2@gentoo.org>
Fix the following metrics that were exported as zero because the
exporter did not know how to read them for SCSI devices:
- smartctl_device_bytes_read
- smartctl_device_bytes_written
- smartctl_device_power_cycle_count
New metrics:
- smartctl_read_errors_corrected_by_eccdelayed
- smartctl_read_errors_corrected_by_eccfast
- smartctl_write_errors_corrected_by_eccdelayed
- smartctl_write_errors_corrected_by_eccfast
Fix labels:
- smartctl_device{model_name} is now populated for SCSI/SAS, using
scsi_model_name.
New labels:
- smartctl_device{} gains: scsi_product, scsi_revision, scsi_vendor,
  scsi_version
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
The exporter currently emits metrics that are meaningless for some types
of drive; they stay at zero because of their default values.
Change the behavior to NOT emit a metric if the underlying JSON field is
not present.
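
The idea can be sketched as follows. This is a minimal Python illustration, not the exporter's actual code: the `FIELDS` mapping and the `metric_samples` helper are hypothetical, and the JSON paths are assumed to match the `nvme_smart_health_information_log` section of `smartctl --json` output.

```python
# Hypothetical mapping from metric name to the JSON path that backs it.
FIELDS = {
    "smartctl_device_percentage_used": ["nvme_smart_health_information_log", "percentage_used"],
    "smartctl_device_media_errors": ["nvme_smart_health_information_log", "media_errors"],
}


def metric_samples(device_json, fields=FIELDS):
    """Yield (metric_name, value) only for fields present in the JSON."""
    for metric_name, path in fields.items():
        node = device_json
        for key in path:
            if not isinstance(node, dict) or key not in node:
                node = None  # path is missing: skip this metric entirely
                break
            node = node[key]
        if node is not None:
            yield metric_name, node


# An NVMe-like document reports a genuine zero; a SATA-like one has no such section.
nvme = {"nvme_smart_health_information_log": {"percentage_used": 0, "media_errors": 3}}
sata = {"device": {"protocol": "ATA"}}
print(dict(metric_samples(nvme)))
print(dict(metric_samples(sata)))
```

A zero that is really present in the JSON is still emitted; only absent fields (or JSON nulls, in this sketch) produce no sample.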
Future related work may include parsing the corresponding metrics for
SATA/SAS SSDs (e.g. `smartctl_device_percentage_used` could be derived
from `SSD_Life_Left` on some drives).
Metrics no longer exported for the wrong type of drive:
- `smartctl_device_nvme_capacity_bytes` (NVME-specific)
- `smartctl_device_available_spare` (NVME-specific, ATA possible)
- `smartctl_device_available_spare_threshold` (NVME-specific, ATA
possible)
- `smartctl_device_critical_warning` (NVME-specific, ATA possible)
- `smartctl_device_interface_speed` (ATA-specific)
- `smartctl_device_media_errors` (NVME-specific, ATA possible)
- `smartctl_device_num_err_log_entries` (NVME-specific, SCSI uses
distinct metrics, ATA possible)
- `smartctl_device_percentage_used` (NVME-specific, ATA possible)
Signed-off-by: Robin H. Johnson <rjohnson@coreweave.com>
* remove redundant meta labels from SCSI metrics
* added `smartctl_device_nvme_capacity_bytes` metric
* for some devices, such as 2.5" Intel & Micron NVMe drives, the `family` field may be empty
The `.user_capacity` field exists only when an NVMe device has a single
namespace. For NVMe devices with multiple namespaces, when the device name
is used without a namespace number (as the exporter does), `.user_capacity`
is absent:
```
smartctl --info --health --attributes \
--tolerance=verypermissive --nocheck=standby --format=brief --log=error \
/dev/nvme11 --json | jq '.user_capacity'
null
smartctl --info --health --attributes \
--tolerance=verypermissive --nocheck=standby --format=brief --log=error \
/dev/nvme11 --json | jq '.nvme_total_capacity'
3840755982336
```
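
As a rough sketch of how a consumer of this JSON might handle both cases, the Python below reads each capacity field independently and reports `None` for whichever is missing. The `capacities` helper is hypothetical and is not the exporter's implementation; it assumes `.user_capacity` is an object with a `bytes` member when present.

```python
def capacities(device_json):
    """Return both capacity values, using None for any that is absent."""
    # `.user_capacity` may be missing or null on multi-namespace NVMe devices.
    user = device_json.get("user_capacity") or {}
    return {
        "user_capacity_bytes": user.get("bytes"),
        "nvme_total_capacity_bytes": device_json.get("nvme_total_capacity"),
    }


# Mirrors the /dev/nvme11 output above: no user_capacity, only the total.
multi_namespace = {"nvme_total_capacity": 3840755982336}
print(capacities(multi_namespace))
```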
Signed-off-by: Konstantin Shalygin <k0ste@k0ste.ru>
Prometheus naming conventions reserve `_count` for the counter in
histograms. For gauge values the naming convention is to use the plural
of the thing being counted.
Signed-off-by: SuperQ <superq@gmail.com>