Introduction

Welcome to the ALUMET user guide! If you want to measure something with ALUMET, you have come to the right place.

To skip the introduction and install ALUMET, go to the "Installing Alumet" section.

What is ALUMET?

ALUMET is a modular and efficient software measurement tool. With ALUMET, you can:

  • measure the energy consumption of your CPU, GPU, and more
  • assign the energy consumption of hardware resources to their consumers (such as processes, K8S pods, containers, etc.)
  • gather performance metrics at a configurable frequency
  • monitor laptops, desktops and HPC servers
  • profile your applications

ALUMET (sometimes written "Alumet") is an acronym for Adaptive, Lightweight, Unified METrics.

High-level architecture

Diagram of Alumet (high-level view)

The measurement sources (in yellow), data transforms (in blue) and outputs (in green) are provided by plugins, not by Alumet's core. Plugins are developed as separate software libraries. This is an improvement over monolithic tools, because it allows developers to easily extend the capabilities of the tool, in every part of the measurement pipeline (sources -> transforms -> outputs).

We offer many "standard" plugins, but you are free to create your own, for instance if you want to gather metrics from a piece of hardware that we do not support. Please read the developer guide to learn more about the creation of plugins.

Performance

The L in Alumet stands for Lightweight. Why is Alumet "lightweight" compared to other measurement tools?

  1. Optimized pipeline: Alumet is written in Rust, optimized for minimal latency and low memory consumption.
  2. Efficient interfaces: When we develop a new measurement source, we try to find the most efficient way of measuring what we're interested in. As a result, many plugins are based on low-level interfaces, such as the Linux perf_events interface, instead of slower higher-level wrappers. In particular, we try to remove useless intermediate levels, such as calling an external program and parsing its text output.
  3. Pay only for what you need: Alumet's modularity allows you to create a bespoke measurement tool by choosing the plugins that suit your needs, and removing the rest. You don't need a mathematical model that assigns the energy consumption of hardware components to processes? Remove it, and enjoy an even smaller disk footprint, CPU overhead, memory use and energy consumption.

Read more about the advantages of Alumet in the next section, "Why ALUMET and not <X>?".

Does it work on <my_machine>?

For now, Alumet works in the following environments:

  • Operating Systems: Linux, macOS [1], Windows [1]
  • Hardware components [2]:
    • CPUs: Intel x86 processors (Sandy Bridge or more recent), AMD x86 processors (Zen 1 or more recent), NVIDIA Jetson CPUs (any model)
    • GPUs: NVIDIA dedicated GPUs, NVIDIA Jetson GPUs (any model)

[1] While the core of Alumet is cross-platform, many plugins only work on Linux, for example the RAPL and perf plugins. There is no macOS-specific nor Windows-specific plugin for the moment, so Alumet will not be able to measure interesting metrics on these systems.

[2] If your computer contains both supported and unsupported components, you can still use Alumet (with the plugins corresponding to the supported components). It will simply not measure the unsupported components.

Why ALUMET and not <X>?

Every tool comes with its limitations, and Alumet is not the only measurement software out there. Here is why we think that Alumet may be better than other existing tools.

Generic and Unified

You can plug many different sources, outputs, and even transformation functions into Alumet, without modifying its core. Instead of having one specialized tool for the CPU, another one for the GPU, and another one for Kubernetes pods, each with a different interface, polling frequency and methodology, you can have one instance of Alumet.

With Alumet, the tedious work that was previously duplicated is now factored into a common tool. This covers tedious work for the administrator: configuring the gathering of the metrics, giving the proper rights to each tool, saving the results to a database, etc. It also covers tedious work for the developer: supporting a new database, writing the configuration management code, optimizing each tool, etc.

Extensible

Alumet is made of a core, on top of which we add plugins. You choose the plugins that suit your needs and build a measurement application with them. You can also create new plugins with an easy-to-use, high-level API.

Some existing tools claim to have a plugin interface and modular code. However, their modularity is often limited to a few functions or abstract interfaces in a largely monolithic codebase. For example, supporting a new source of measurements often requires modifying the core of the tool. In contrast, Alumet offers modularity in a way that is both easy to use and powerful. The first piece of evidence is that the core and the plugins are in distinct crates, rather than in a monolithic codebase. The second is that Alumet plugins have fewer restrictions than Telegraf plugins. For instance, the same Alumet plugin can provide sources, transforms and outputs. A plugin can also modify the pipeline's configuration at runtime, without prior knowledge of the other plugins.

Lightweight and fast

Alumet is written in Rust and optimized for minimal latency and low memory consumption. Furthermore, many plugins are based on low-level sensors like the perf_events interface of the Linux kernel. Finally, the plugin system allows you to only include the plugins that suit your needs, instead of installing a do-it-all monolith.

Our preliminary results suggest that Alumet uses fewer CPU cycles, consumes less memory and is overall more efficient than the existing tools. We will upload benchmark results in the future.

Adaptive

We worked hard to provide two forms of adaptation:

  1. Adapting to your needs at compile-time: thanks to Alumet's modularity, you are able to build a measurement tool tailored to your needs.
  2. Adapting to the context at run-time: unlike other tools, Alumet allows the pipeline to be reconfigured on-the-fly. With our novel approach, you can switch from monitoring at 1 Hz to profiling at 1000 Hz without restarting anything.

Rigorous

Finally, the project is based on active research work, involving both academia and industry. One of our goals is to overcome the limitations and mistakes that we found in the other tools. We want to produce a robust tool that will offer accurate measurements in different contexts, such as CS research, HPC clusters and Cloud services.

People at BULL SAS (part of Eviden) and the LIG (Grenoble's laboratory of computer science) are working on Alumet.

Installing Alumet

⚠️  Alumet is currently in Beta.

If you have trouble using Alumet, do not hesitate to contact us: we will help you find a solution. If you think that you have found a bug, please open an issue in the repository.

For the moment, the only way to use Alumet is to download its sources and to compile it (see below). We intend to provide easy-to-use packages in the future.

Compiling from source

Prerequisite: A recent version of Rust is required (at least 1.76 for now). You can run rustc --version to check your version. The easiest way to install a recent version of Rust is to use rustup.
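As a rough pre-flight check, the following sketch compares the installed rustc version with the 1.76 minimum stated above. The helper name version_ge is hypothetical, and the comparison assumes a sort command that supports the -V option (as GNU coreutils does).

```shell
# Minimum Rust version stated in this guide.
MIN_RUST="1.76.0"

# version_ge A B: succeeds if version A is greater than or equal to version B.
# Assumption: "sort -V" (version sort) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v rustc >/dev/null 2>&1; then
    # "rustc --version" prints e.g. "rustc 1.84.0 (...)": keep the second word.
    current="$(rustc --version | awk '{print $2}')"
    if version_ge "$current" "$MIN_RUST"; then
        echo "rustc $current is recent enough"
    else
        echo "rustc $current is too old, try: rustup update stable"
    fi
else
    echo "rustc not found, install it with rustup (https://rustup.rs)"
fi
```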

Open a Terminal and clone the repository:

git clone https://github.com/alumet-dev/alumet.git

The Alumet repository contains multiple crates ("crates" are Rust libraries/packages). To run Alumet, we are interested in alumet-agent: a crate that produces a runnable measurement tool by compiling the core of Alumet and a set of standard plugins into a single executable binary.

Let's compile the agent:

cargo build -p alumet-agent

The binary should be located in target/debug/alumet-agent. You can check this with a simple ls:

ls target/debug/alumet-agent

If the agent is there, you can run it. Otherwise, look into the target directory to find the agent.
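If the binary is not where you expect it, a small search can help. The following is only a sketch: the function name find_agent is hypothetical, and the depth limit assumes the usual cargo layout (target/debug, target/release, or target/<target-triple>/debug when cross-compiling).

```shell
# find_agent [DIR]: list regular files named alumet-agent under DIR (default: ./target).
# A depth of 3 also covers target/<target-triple>/debug/alumet-agent.
find_agent() {
    find "${1:-target}" -maxdepth 3 -type f -name alumet-agent 2>/dev/null
}

# Usage, from the root of the repository:
#   find_agent
```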

For the first time, let's use --help to learn about the available arguments.

❯ ./target/debug/alumet-agent --help                                          
Alumet standard agent: measure energy and performance metrics

Usage: alumet-agent [OPTIONS] [COMMAND]

Commands:
  run      Run the agent and monitor the system
  exec     Execute a command and observe its process
  config   Manipulate the configuration
  plugins  Get plugins information
  help     Print this message or the help of the given subcommand(s)

Options:
      --config <CONFIG>
          Path to the config file
          
          [env: ALUMET_CONFIG=]
          [default: alumet-config.toml]

      --no-default-config
          If set, the config file must exist, otherwise the agent will fail
          to start with an error

      --config-override <CONFIG_OVERRIDE>
          Config options overrides.
          
          Use dots to separate TOML levels, ex. `plugins.rapl.poll_interval='1ms'`

      --plugins <PLUGINS>
          List of plugins to enable, separated by commas, ex. `csv,rapl`.
          
          All the other plugins will be disabled.

[...]

Some lines are omitted here. Run the agent with the --help flag to see the full output.

Choosing the plugins that you need

The standard agent, which you have just compiled, contains multiple Alumet plugins. Each plugin can measure things, compute additional values based on the measurements, and/or save the measurements to a storage.

To see which plugins are included in the agent, run:

❯ ./target/debug/alumet-agent plugins list
[...]
Available plugins:
- OAR3 v0.1.0
- csv v0.2.0
- influxdb v0.1.0
- k8s v0.1.0
- mongodb v0.1.0
- nvidia v0.3.0
- oar2-plugin v0.1.0
- perf v0.1.0
- procfs v0.1.0
- rapl v0.3.1
- relay-client v0.6.0
- relay-server v0.6.0
- socket-control v0.2.0

Edit the configuration file or use the --plugins flag to enable/disable plugins.

For your first time, let's enable a very simple set of plugins:

  • rapl: a plugin that measures the energy consumption of your CPU
  • csv: a plugin that saves all the measurements to a CSV file

Run Alumet with these plugins by using the --plugins flag:

❯ ./target/debug/alumet-agent --plugins rapl,csv
[2025-02-20T13:25:26Z INFO  alumet_agent] Starting Alumet agent 'alumet-agent' v0.8.0-6c72253-dirty (2025-02-20T12:40:49.666737256Z, rustc 1.84.0, debug=true)
[2025-02-20T13:25:26Z WARN  alumet_agent] DEBUG assertions are enabled, this build of Alumet is fine for debugging, but not for production.
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] Initializing the plugins...
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] 2 plugins initialized.
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] Starting the plugins...
[2025-02-20T13:25:26Z INFO  plugin_rapl] Available RAPL domains: dram, package, platform, pp0, pp1
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] Plugin startup complete.
    🧩 2 plugins started:
        - csv v0.2.0
        - rapl v0.3.1
    
    ⭕ 11 plugins disabled:
        - OAR3 v0.1.0
        - influxdb v0.1.0
        - k8s v0.1.0
        - mongodb v0.1.0
        - nvidia v0.3.0
        - oar2-plugin v0.1.0
        - perf v0.1.0
        - procfs v0.1.0
        - relay-client v0.6.0
        - relay-server v0.6.0
        - socket-control v0.2.0
    
    📏 1 metric registered:
        - rapl_consumed_energy: F64 (J)
    
    📥 1 source, 🔀 0 transform and 📝 1 output registered.
    
    🔔 0 metric listener registered.
    
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] Running pre-pipeline-start hooks...
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] Starting the measurement pipeline...
[2025-02-20T13:25:26Z INFO  alumet::pipeline::builder] Only one output and no transform, using a simplified and optimized measurement pipeline.
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] 🔥 ALUMET measurement pipeline has started.
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] Running post-pipeline-start hooks...
[2025-02-20T13:25:26Z INFO  alumet::agent::builder] 🔥 ALUMET agent is ready.

Alumet will start to monitor various hardware and software components. Notice how you can immediately see which plugins are used and which metrics they measure. Use Ctrl+C to shut the agent down.

If you get an error, refer to the following section.

Obtaining required privileges

Measuring some metrics, such as RAPL energy counters and perf_events, requires specific privileges (because we read low-level data). Alumet will warn you about missing privileges and will suggest commands to fix the issue (there are several options).

For example, if you get an error like this:

[2025-02-20T13:28:40Z WARN  plugin_rapl] I could not use perf_events to read RAPL energy counters: perf_event_open failed. Try to set kernel.perf_event_paranoid to 0 or -1, or to give CAP_PERFMON to the application's binary (CAP_SYS_ADMIN before Linux 5.8).

[...]
Error: startup failure

Caused by:
    0: plugin failed to start: rapl v0.3.1
    1: Could not open /sys/devices/virtual/powercap/intel-rapl/intel-rapl:1/energy_uj. Try to adjust file permissions.
    2: Permission denied (os error 13)

Read the logs and apply one of the proposed solutions. The easiest one is to run this command before starting Alumet:

sudo sysctl -w kernel.perf_event_paranoid=0

You only need to run it once per boot: if you reboot the machine, you need to run it again. See man sysctl.conf for a way of making this setting permanent.
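One common way to make the setting persistent, assuming a distribution that reads drop-in files from /etc/sysctl.d/, is to create a small configuration file there. The file name 99-alumet.conf below is only an illustration, not something Alumet requires.

```conf
# /etc/sysctl.d/99-alumet.conf (hypothetical file name)
# Same value as the "sysctl -w" command above.
kernel.perf_event_paranoid = 0
```

The file is read at the next boot. To load it immediately, run sudo sysctl --system.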

Be careful about sudo

We recommend not running the Alumet agent with sudo. It is better to give appropriate privileges to the agent binary or to make the system configuration more permissive.

In any case, please never use sudo cargo run, because that would compile the project with the root user, making it unusable for you.
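One possible way to give privileges to the agent binary is to attach the CAP_PERFMON capability mentioned in the log message above, using the setcap and getcap tools from libcap. This is a sketch under assumptions: the function name grant_perfmon is hypothetical, and it assumes Linux 5.8 or later with a libcap version that knows cap_perfmon. It only addresses the perf_events path, not the permissions of the powercap files.

```shell
# grant_perfmon BINARY: attach CAP_PERFMON to BINARY, then display its capabilities.
# Only setcap itself runs with sudo; the agent still runs as a regular user.
grant_perfmon() {
    sudo setcap cap_perfmon=ep "$1" && getcap "$1"
}

# Usage (the capability is stored on the file, so repeat it after each rebuild):
#   grant_perfmon target/release/alumet-agent
```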

Tips

Path to the binary

When building for the host architecture with the default target (which links to libc), the binary produced by cargo is located at:

  • target/debug/alumet-agent in debug mode
  • target/release/alumet-agent in release mode

The aforementioned paths are relative to the root directory of the Git repository.

Release mode

By default, the measurement tool is built in debug mode, which enables better diagnostics but disables many optimizations. To deploy Alumet "in production", you should use the release mode by adding --release to the cargo flags.

cargo build -p alumet-agent --release

The optimized agent will be saved to target/release/alumet-agent.

CSV output file

The default CSV file used by the csv plugin is alumet-output.csv. You can change this by editing alumet-config.toml, or by using the --output-file flag.

Configuration file

The configuration file is automatically created by Alumet if it does not exist. Its content depends on the set of enabled plugins.

-> Learn more about the Alumet configuration in the "Configuration file" chapter below.

Multiple agents?

As of PR#93 (merged in Alumet 0.8.0-d7565a6), we provide one standard agent with all the "official" plugins. You can still create your own custom agent in a separate crate. See the PR description for more information.

Configuration file

The file alumet-config.toml contains the configuration of the Alumet agent. It is automatically created by Alumet if it does not exist.

Commented example

Here is a commented example of the configuration file that is generated by the agent.

# -- global agent config --

# upper bound of the interval between two updates of the commands received by the measurement sources
max_update_interval = "500ms"

# -- plugins configs, one table per plugin --

[plugins.rapl]
# Interval between each measurement of the RAPL plugin.
# Most plugins that provide measurement sources also provide this configuration option.
# Example: "1s" = 1 second
# Example: "1ms" = 1 millisecond
poll_interval = "1s"

# Measurements are kept in a buffer and are only sent to the next step of the Alumet pipeline
# when the flush interval expires.
flush_interval = "5s"

# Set to "true" to disable perf_events and only use powercap instead.
# By default, the rapl plugin tries to use perf_events, and falls back to powercap if that fails.
no_perf_events = false

[plugins.csv]
# Path to the output file that contains the measurements.
output_path = "alumet-output.csv"

# Flush the file writer after each write operation.
# A "write operation" may contain multiple measurement points.
force_flush = true

# In the "metric" column, append the unit to the name of each metric (except if the unit's name is empty).
# For instance, a metric "energy" with a unit "Joules" (symbol "J") will be serialized as "energy_J".
append_unit_to_metric_name = true

# Use the display name of the units instead of their unique name, as specified by the UCUM.
# See https://ucum.org/ucum for a list of units and their symbols.
use_unit_display_name = true

# The character to use to separate CSV columns.
csv_delimiter = ";"

[plugins.perf]
# List of hardware perf_events to measure.
hardware_events = ["REF_CPU_CYCLES", "CACHE_MISSES", "BRANCH_MISSES"]

# List of software perf_events to measure.
software_events = []

# List of cache perf_events to measure.
cache_events = ["LL_READ_MISS"]

Regenerating the file

When you change the plugins that are included in the agent, or when you install a new version of Alumet, the configuration options may change. You can replace the existing configuration file with a fresh, up-to-date version by using the config regen command.

Example:

alumet-agent config regen
# Note: replace alumet-agent with the path to the binary, or with `cargo run --`

Example (if you use cargo run):

cargo run -- config regen

Command-line arguments

Some command-line arguments override the options defined in the configuration file. This is the case for --max-update-interval, which overrides max_update_interval from the config file.
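The --help output shown in the installation chapter also lists a generic --config-override option, which uses dots to separate TOML levels. Assuming it behaves as that help text describes, passing --config-override "plugins.rapl.poll_interval='1ms'" should have the same effect as the following fragment of alumet-config.toml, without editing the file:

```toml
# Equivalent config fragment (assumption: the override maps dotted keys to TOML tables).
[plugins.rapl]
poll_interval = "1ms"
```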

Execution mode

When started with the exec command, the Alumet agent spawns a new process with the specified command and stops when the process exits.

The execution mode automatically makes some plugins, such as the perf plugin, gather more metrics about the spawned process.

Example

Let's try this feature with a simple sleep command.

alumet-agent exec sleep 1

With the standard plugins (rapl, perf, csv), the resulting CSV file looks like the following (formatted to make it easier to read).

metric                       ;timestamp                      ;value             ;resource_kind ;resource_id ;consumer_kind ;consumer_id ;__late_attributes
perf_hardware_REF_CPU_CYCLES ;2024-05-14T21:28:49.416768909Z ; 0                ;local_machine ;            ;process       ;     728039 ;
perf_hardware_CACHE_MISSES   ;2024-05-14T21:28:49.416768909Z ; 0                ;local_machine ;            ;process       ;     728039 ;
perf_hardware_BRANCH_MISSES  ;2024-05-14T21:28:49.416768909Z ; 0                ;local_machine ;            ;process       ;     728039 ;
perf_cache_LL_READ_MISS      ;2024-05-14T21:28:49.416768909Z ; 0                ;local_machine ;            ;process       ;     728039 ;
rapl_consumed_energy_J       ;2024-05-14T21:28:50.389874134Z ; 0.89825439453125 ;dram          ;          0 ;local_machine ;            ;domain=dram
rapl_consumed_energy_J       ;2024-05-14T21:28:50.389874134Z ; 5.779296875      ;cpu_package   ;          0 ;local_machine ;            ;domain=pp0
rapl_consumed_energy_J       ;2024-05-14T21:28:50.389874134Z ;50.3060302734375  ;local_machine ;            ;local_machine ;            ;domain=platform
rapl_consumed_energy_J       ;2024-05-14T21:28:50.389874134Z ; 0                ;cpu_package   ;          0 ;local_machine ;            ;domain=pp1
rapl_consumed_energy_J       ;2024-05-14T21:28:50.389874134Z ;10.3299560546875  ;cpu_package   ;          0 ;local_machine ;            ;domain=package

There are several things to note here.

First, as expected, a simple sleep does not use any CPU cycles. This is reported by the perf_hardware_REF_CPU_CYCLES metric.

Second, the computer consumed some energy during the sleep. This is reported by the rapl_consumed_energy_J metric. The J indicates that the measurements are in Joules. Note that perf_hardware_REF_CPU_CYCLES does not have a unit suffix, because it's a dimensionless value: a counter. In any case, the CSV plugin can be configured not to include the suffix in the resulting file. But let's go back to the RAPL metric. Here, the metric is given five times, because five different RAPL domains are available on this machine (dram, pp0, pp1, package and platform).

⚠️ As indicated by the value local_machine in the consumer_kind column, the metric rapl_consumed_energy_J does not report the energy consumed by the process spawned with alumet-agent exec. It reports the total energy consumption of the associated RAPL domain since the previous measurement of the metric (here, there is only one value per domain).

Finally, the timestamps are serialized in the UTC timezone, hence the Z suffix.
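To illustrate how such a file can be post-processed, here is a small sketch that sums rapl_consumed_energy_J per RAPL domain with awk. The function name sum_rapl_energy is hypothetical. The sketch assumes the column order shown above, the ";" delimiter, the _J unit suffix (append_unit_to_metric_name = true), and a last column that contains only domain=<name>.

```shell
# sum_rapl_energy FILE: print "<domain> <joules>" for each RAPL domain found in FILE.
sum_rapl_energy() {
    awk -F';' '
        # Remove the padding spaces that a hand-formatted file may contain.
        function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
        # Skip the header line and keep only the RAPL energy rows.
        NR > 1 && trim($1) == "rapl_consumed_energy_J" {
            domain = trim($8)
            sub(/^domain=/, "", domain)   # "domain=dram" becomes "dram"
            total[domain] += trim($3)     # column 3 holds the value
        }
        END { for (d in total) printf "%s %.3f\n", d, total[d] }
    ' "$1" | LC_ALL=C sort
}

# Usage:
#   sum_rapl_energy alumet-output.csv
```

Because each RAPL value covers the period since the previous measurement, summing the rows of one domain gives the energy consumed by that domain over the whole recording, not by a single process.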