As you might already be aware, an outage last week caused heimdall and bor to stall. Since the issue was in heimdall, we unblocked bor by creating a few spans (replicating the last one before the chain stopped creating blocks). The spans created then will run through 20th March 2022. Before that point, we need to ensure that heimdall is in a state to start creating new spans normally. This requires an update to heimdall in which we will backfill these spans into the heimdall db. The update involves a hard fork, and all nodes need to upgrade before block number 8664000, which will be mined at approximately 10:30 AM UTC on Friday, 18th March 2022.
The main PR that will be going live in this release: https://github.com/maticnetwork/heimdall/pull/791
Update: We have also added instructions to set up telemetry for heimdall, which will help us monitor overall network health and check the versions of the listed nodes.
Instructions to Upgrade
These are the instructions to upgrade heimdall on your nodes:
- Stop heimdall processes:
sudo service heimdalld stop
sudo service heimdalld-rest-server stop
sudo service heimdalld-bridge stop
- Upgrade heimdall to the latest version:
cd ~/heimdall
git pull
git checkout v0.2.8
make install
- You should see output similar to:
====================================================
==================Build For Overriding Spans Successful==================
====================================================
- Ensure that you are on the latest version:
heimdalld version
# It should return
# 0.2.8
- Start heimdall services:
sudo service heimdalld restart
sudo service heimdalld-rest-server restart
# Only on validator nodes:
sudo service heimdalld-bridge restart
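The version check in the steps above can also be scripted, which may help when upgrading several nodes. The following is a minimal sketch and not part of the official instructions: `version_matches` and `report_version` are hypothetical helper names, and the sketch assumes the version string is passed in as text (for example, captured from `heimdalld version`), possibly with a leading `v` or surrounding whitespace.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the reported version equals the expected one.
# Whitespace and a leading "v" are ignored, so "v0.2.8" matches "0.2.8".
version_matches() {
  vm_expected="$(printf '%s' "$1" | tr -d '[:space:]')"
  vm_actual="$(printf '%s' "$2" | tr -d '[:space:]')"
  [ "${vm_actual#v}" = "${vm_expected#v}" ]
}

# Hypothetical helper: print a short verdict and return non-zero on mismatch.
report_version() {
  if version_matches "$1" "$2"; then
    echo "OK: heimdall is on ${1#v}"
  else
    echo "MISMATCH: expected ${1#v}, got ${2:-<empty>}"
    return 1
  fi
}
```

On a node, this could be called as `report_version 0.2.8 "$(heimdalld version)"`, assuming the command prints only the version number.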
Telemetry Setup for Heimdall
These are the instructions to set up telemetry for heimdall so that your nodes can be monitored at https://heimdall-mainnet.vitwit.com/
- Clone the repository and set up the config:
git clone https://github.com/vitwit/matic-telemetry.git
cd matic-telemetry
mkdir -p ~/.telemetry/config
cp example.config.toml ~/.telemetry/config/config.toml
- Update the config for your node, located at ~/.telemetry/config/config.toml. In the [stats_details] block, set the node key to the name of your node. The secret key and net stats IP stay the same as shown below.
[stats_details]
secret_key = "heimdall_mainnet"
node = "<node-name>"
net_stats_ip = "heimdall-mainnet.vitwit.com:3000"
Please use the following naming convention for the node name:
<entity_name>-<network_name>--<unique_identifier>
E.g.:
vitwit-mainnet-sentry-ip10_0_0_1
blockvigil-mainnet-explorer-1
unique_identifier is just an identifier for you to distinguish between different nodes you are running (if you have multiple nodes).
- Build the binary (from the matic-telemetry directory):
go mod tidy
go build -o telemetry
# Assumes $GOBIN is set and on your PATH
mv telemetry $GOBIN
- Create a service. (You can use a different service name if it conflicts with anything on your local system. If you do so, make sure you use that name in the subsequent commands below.)
echo "[Unit]
Description=Telemetry
After=network-online.target
[Service]
User=$USER
ExecStart=$(which telemetry)
Restart=always
RestartSec=3
LimitNOFILE=4096
[Install]
WantedBy=multi-user.target" | sudo tee "/lib/systemd/system/telemetry.service"
- Start the telemetry service:
sudo systemctl enable telemetry.service
sudo systemctl start telemetry.service
- You can monitor the service using journalctl. If everything works well, it should start exporting metrics.
journalctl -u telemetry -f
- Ensure that your node name is listed at https://heimdall-mainnet.vitwit.com/
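If you run several nodes, the node name and the [stats_details] block from the telemetry steps above can be generated rather than typed by hand. The sketch below is an illustration only: `node_name` and `stats_details_block` are hypothetical helpers, and the role field (`sentry`, `explorer`) is an assumption inferred from the two example names, not something the naming convention states explicitly. The secret key and net stats address are the values given in the instructions.

```shell
#!/bin/sh
# Hypothetical helper: join the name parts with single hyphens.
# Arguments: entity, network, role (assumed from the examples), unique identifier.
node_name() {
  printf '%s-%s-%s-%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: print the [stats_details] block for a given node name.
# secret_key and net_stats_ip are fixed to the values from the instructions.
stats_details_block() {
  printf '[stats_details]\n'
  printf 'secret_key = "heimdall_mainnet"\n'
  printf 'node = "%s"\n' "$1"
  printf 'net_stats_ip = "heimdall-mainnet.vitwit.com:3000"\n'
}
```

For example, `stats_details_block "$(node_name vitwit mainnet sentry ip10_0_0_1)"` prints a block that can be compared against ~/.telemetry/config/config.toml before starting the service.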