Prometheus & Grafana Monitoring
After deploying the Monkeys application, you can monitor the running status and performance metrics of Monkeys services using Prometheus & Grafana.
Monkeys Server Metrics
Monkeys Server is the core service of Monkeys, exposing Node.js application metrics such as CPU, memory, and network usage.
Prometheus Configuration
Add the following configuration to the Prometheus configuration file:
- `global.scrape_interval`: the monitoring (scrape) interval; `5s` is recommended.
- `scrape_configs[].metrics_path`: the Prometheus metrics endpoint of Monkeys Server is `/metrics`.
- `scrape_configs[].static_configs[].targets`: the address and port of the Monkeys Server service; adjust these to match your actual deployment.
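Assembled from the fields above, a minimal `prometheus.yml` might look like the following sketch; the job name and the `localhost:3000` target are placeholder assumptions, so replace them with wherever your Monkeys Server actually listens:

```yaml
global:
  scrape_interval: 5s                 # recommended interval for Monkeys Server metrics

scrape_configs:
  - job_name: monkeys-server          # job label is arbitrary
    metrics_path: /metrics            # Monkeys Server metrics endpoint
    static_configs:
      - targets: ["localhost:3000"]   # replace with your Monkeys Server address:port
```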
Grafana Configuration
Configure Prometheus Data Source
- In the Grafana console, go to Connections - Data Sources and add a Prometheus data source:
- Prometheus server URL: Enter the address and port of the Prometheus service.
- Scrape interval: Monitoring interval, recommended to be the shortest interval (1s).
- Click Save & Test to save and test if the Prometheus data source is configured correctly.
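As an alternative to clicking through the UI, the same data source can be declared with Grafana's file-based provisioning (e.g. in `/etc/grafana/provisioning/datasources/prometheus.yml`); the URL below is an assumption about where your Prometheus service runs:

```yaml
apiVersion: 1

datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090   # address and port of your Prometheus service
    isDefault: true
    jsonData:
      timeInterval: "1s"         # scrape interval used by dashboard queries
```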
- Import the monitoring dashboard
- In the Grafana console, go to the Dashboard page, click the New button.
- Select the Import option.
- In the Grafana.com Dashboard URL or ID field, enter `11159` (https://grafana.com/grafana/dashboards/11159-nodejs-application-dashboard/).
Conductor Metrics
Conductor is the workflow orchestration engine used by Monkeys, exposing metrics such as the number of currently running workflows, completed workflows, failed workflows, JVM memory, database connections, etc.
Prometheus Configuration
Add the following configuration to the Prometheus configuration file:
- `global.scrape_interval`: the monitoring (scrape) interval; the shortest interval (`1s`) is recommended, because a longer interval may fail to capture the actual number of currently running workflows in real time.
- `scrape_configs[].metrics_path`: the Prometheus metrics endpoint of Conductor is `/actuator/prometheus`.
- `scrape_configs[].static_configs[].targets`: the address and port of the Conductor service; adjust these to match your actual deployment.
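Put together, a minimal `prometheus.yml` for scraping Conductor might look like the following sketch; the job name and the `localhost:8080` target are placeholder assumptions for your actual Conductor deployment:

```yaml
global:
  scrape_interval: 1s                    # short interval so running-workflow counts stay accurate

scrape_configs:
  - job_name: conductor                  # job label is arbitrary
    metrics_path: /actuator/prometheus   # Conductor metrics endpoint
    static_configs:
      - targets: ["localhost:8080"]      # replace with your Conductor address:port
```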
Grafana Configuration
Configure Prometheus Data Source
- In the Grafana console, go to Connections - Data Sources and add a Prometheus data source:
- Prometheus server URL: Enter the address and port of the Prometheus service.
- Scrape interval: Monitoring interval, recommended to be the shortest interval (1s).
- Click Save & Test to save and test if the Prometheus data source is configured correctly.
- Import the monitoring dashboard
- In the Grafana console, go to the Dashboard page, click the New button.
- Select the Import option.
- Paste the Conductor dashboard JSON model into the Import via dashboard JSON model field.
vLLM Metrics
vLLM is used to deploy and serve large language models behind an OpenAI-compatible API, exposing metrics such as model loading time, inference time, and memory usage.
Prometheus Configuration
Add the following configuration to the Prometheus configuration file:
- `global.scrape_interval`: the monitoring (scrape) interval; `5s` is recommended.
- `scrape_configs[].metrics_path`: the Prometheus metrics endpoint of vLLM is `/metrics`.
- `scrape_configs[].static_configs[].targets`: the address and port of the vLLM service; adjust these to match your actual deployment.
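Put together, a minimal `prometheus.yml` for scraping vLLM might look like the following sketch; the job name and the `localhost:8000` target are placeholder assumptions, so replace them with your vLLM service's actual address and port:

```yaml
global:
  scrape_interval: 5s                 # recommended interval for vLLM metrics

scrape_configs:
  - job_name: vllm                    # job label is arbitrary
    metrics_path: /metrics            # vLLM metrics endpoint
    static_configs:
      - targets: ["localhost:8000"]   # replace with your vLLM address:port
```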
Grafana Configuration
Configure Prometheus Data Source
- In the Grafana console, go to Connections - Data Sources and add a Prometheus data source:
- Prometheus server URL: Enter the address and port of the Prometheus service.
- Scrape interval: Monitoring interval, recommended to be the shortest interval (1s).
- Click Save & Test to save and test if the Prometheus data source is configured correctly.
- Import the monitoring dashboard
- In the Grafana console, go to the Dashboard page, click the New button.
- Select the Import option.
- Paste the vLLM dashboard JSON model into the Import via dashboard JSON model field. The JSON is available at https://github.com/vllm-project/vllm/blob/main/examples/production_monitoring/grafana.json.