Commit eb23369

Scrape interval per metric (#119)
Signed-off-by: Anders Swanson <anders.swanson@oracle.com>
1 parent b509205 commit eb23369

4 files changed: +100 additions, -55 deletions

README.md

Lines changed: 33 additions & 33 deletions
@@ -616,28 +616,27 @@ The following command line arguments (flags) can be passed to the exporter:

 ```bash
 Usage of oracledb_exporter:
-  --log.format value
-        If set use a syslog logger or JSON logging. Example: logger:syslog?appname=bob&local=7 or logger:stdout?json=true. Defaults to stderr.
-  --log.level value
-        Only log messages with the given severity or above. Valid levels: [debug, info, warn, error, fatal].
-  --custom.metrics string
-        Comma separated list of file(s) that contain various custom metrics in a TOML format.
-  --default.metrics string
-        Default TOML file metrics.
-  --web.systemd-socket
-        Use systemd socket activation listeners instead of port listeners (Linux only).
-  --web.listen-address string
-        Address to listen on for web interface and telemetry. (default ":9161")
-  --web.telemetry-path string
-        Path under which to expose metrics. (default "/metrics")
-  --database.maxIdleConns string
-        Number of maximum idle connections in the connection pool. (default "0")
-  --database.maxOpenConns string
-        Number of maximum open connections in the connection pool. (default "10")
-  --query.timeout int
-        Query timeout (in seconds).
-  --web.config.file
-        Path to configuration file that can enable TLS or authentication.
+  --web.telemetry-path="/metrics"
+        Path under which to expose metrics. (env: TELEMETRY_PATH)
+  --default.metrics="default-metrics.toml"
+        File with default metrics in a TOML file. (env: DEFAULT_METRICS)
+  --custom.metrics=""        Comma separated list of file(s) that contain various custom metrics in a TOML format. (env: CUSTOM_METRICS)
+  --query.timeout=5          Query timeout (in seconds). (env: QUERY_TIMEOUT)
+  --database.maxIdleConns=0  Number of maximum idle connections in the connection pool. (env: DATABASE_MAXIDLECONNS)
+  --database.maxOpenConns=10
+        Number of maximum open connections in the connection pool. (env: DATABASE_MAXOPENCONNS)
+  --scrape.interval=0s       Interval between each scrape. Default is to scrape on collect requests.
+  --log.disable=0            Set to 1 to disable alert logs
+  --log.interval=15s         Interval between log updates (e.g. 5s).
+  --log.destination="/log/alert.log"
+        File to output the alert log to. (env: LOG_DESTINATION)
+  --web.listen-address=:9161 ...
+        Addresses on which to expose metrics and web interface. Repeatable for multiple addresses.
+  --web.config.file=""       Path to configuration file that can enable TLS or authentication. See: https://github.com/prometheus/exporter-toolkit/blob/master/docs/web-configuration.md
+  --log.level=info           Only log messages with the given severity or above. One of: [debug, info, warn, error]
+  --log.format=logfmt        Output format of log messages. One of: [logfmt, json]
+  --[no-]version             Show application version.
+
 ```

 ### Using OCI Vault
@@ -658,17 +657,18 @@ exporter, you can:

 Custom metrics file must contain a series of `[[metric]]` definitions, in TOML. Each metric definition must follow the custom metric schema:

-| Field Name       | Description | Type | Required | Default |
-|------------------|-------------|------|----------|---------|
-| context          | Metric context, used to build metric FQN | String | Yes | |
-| labels           | Metric labels, which must match column names in the query. Any column that is not a label will be parsed as a metric | Array of Strings | No | |
-| metricsdesc      | Mapping between field(s) in the request and comment(s) | Dictionary of Strings | Yes | |
-| metricstype      | Mapping between field(s) in the request and [Prometheus metric types](https://prometheus.io/docs/concepts/metric_types/) | Dictionary of Strings | No | |
-| metricsbuckets   | Split [histogram](https://prometheus.io/docs/concepts/metric_types/#histogram) metric types into buckets based on value ([example](./custom-metrics-example/metric-histogram-example.toml)) | Dictionary of String dictionaries | No | |
-| fieldtoappend    | Field from the request to append to the metric FQN | String | No | |
-| request          | Oracle database query to run for metrics scraping | String | Yes | |
-| ignorezeroresult | Whether or not an error will be printed if the request does not return any results | Boolean | No | false |
-| querytimeout     | Oracle Database query timeout, in seconds | Integer | No | 5, or value of query.timeout |
+| Field Name       | Description | Type | Required | Default |
+|------------------|-------------|------|----------|---------|
+| context          | Metric context, used to build metric FQN | String | Yes | |
+| labels           | Metric labels, which must match column names in the query. Any column that is not a label will be parsed as a metric | Array of Strings | No | |
+| metricsdesc      | Mapping between field(s) in the request and comment(s) | Dictionary of Strings | Yes | |
+| metricstype      | Mapping between field(s) in the request and [Prometheus metric types](https://prometheus.io/docs/concepts/metric_types/) | Dictionary of Strings | No | |
+| metricsbuckets   | Split [histogram](https://prometheus.io/docs/concepts/metric_types/#histogram) metric types into buckets based on value ([example](./custom-metrics-example/metric-histogram-example.toml)) | Dictionary of String dictionaries | No | |
+| fieldtoappend    | Field from the request to append to the metric FQN | String | No | |
+| request          | Oracle database query to run for metrics scraping | String | Yes | |
+| ignorezeroresult | Whether or not an error will be printed if the request does not return any results | Boolean | No | false |
+| querytimeout     | Oracle Database query timeout duration, e.g., 300ms, 0.5h | String duration | No | Value of query.timeout in seconds |
+| scrapeinterval   | Custom metric scrape interval, used if scrape.interval is provided, otherwise metrics are always scraped on request. | String duration | No | |

 Here's a simple example of a metric definition:

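To make the two new duration-valued fields concrete, here is an illustrative definition; the context, query, and values are hypothetical, but the field names follow the schema above:

```toml
[[metric]]
context = "example"                         # hypothetical context, used to build the metric FQN
metricsdesc = { value = "Example metric." } # help text for the "value" column
request = "SELECT 1 AS value FROM dual"     # hypothetical query
querytimeout = "300ms"                      # per-metric timeout as a duration string
scrapeinterval = "5m"                       # only takes effect when --scrape.interval is set
```

With the default --scrape.interval=0s, scrapeinterval is ignored and the metric is scraped on every collect request, as the table notes.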
collector/collector.go

Lines changed: 23 additions & 21 deletions
@@ -44,6 +44,7 @@ type Exporter struct {
     dbtypeGauge prometheus.Gauge
     db          *sql.DB
     logger      log.Logger
+    lastTick    *time.Time
 }

 // Config is the configuration of the exporter
@@ -81,7 +82,8 @@ type Metric struct {
     FieldToAppend    string
     Request          string
     IgnoreZeroResult bool
-    QueryTimeout     int
+    QueryTimeout     string
+    ScrapeInterval   string
 }

 // Metrics is a container structure for prometheus metrics
@@ -199,7 +201,7 @@ func (e *Exporter) Collect(ch chan<- prometheus.Metric) {
     // otherwise do a normal scrape per request
     e.mu.Lock() // ensure no simultaneous scrapes
     defer e.mu.Unlock()
-    e.scrape(ch)
+    e.scrape(ch, nil)
     ch <- e.duration
     ch <- e.totalScrapes
     ch <- e.error
@@ -217,17 +219,19 @@ func (e *Exporter) RunScheduledScrapes(ctx context.Context, si time.Duration) {

     for {
         select {
-        case <-ticker.C:
+        case tick := <-ticker.C:
+
             e.mu.Lock() // ensure no simultaneous scrapes
-            e.scheduledScrape()
+            e.scheduledScrape(&tick)
+            e.lastTick = &tick
             e.mu.Unlock()
         case <-ctx.Done():
             return
         }
     }
 }

-func (e *Exporter) scheduledScrape() {
+func (e *Exporter) scheduledScrape(tick *time.Time) {
     metricCh := make(chan prometheus.Metric, 5)

     wg := &sync.WaitGroup{}
@@ -244,20 +248,19 @@ func (e *Exporter) scheduledScrape() {
             return
         }
     }()
-    e.scrape(metricCh)
+    e.scrape(metricCh, tick)

     // report metadata metrics
     metricCh <- e.duration
     metricCh <- e.totalScrapes
     metricCh <- e.error
     e.scrapeErrors.Collect(metricCh)
     metricCh <- e.up
-
     close(metricCh)
     wg.Wait()
 }

-func (e *Exporter) scrape(ch chan<- prometheus.Metric) {
+func (e *Exporter) scrape(ch chan<- prometheus.Metric, tick *time.Time) {
     e.totalScrapes.Inc()
     var err error
     defer func(begun time.Time) {
@@ -336,7 +339,7 @@ func (e *Exporter) scrape(ch chan<- prometheus.Metric) {
     }

     scrapeStart := time.Now()
-    if err = e.ScrapeMetric(e.db, ch, metric); err != nil {
+    if err = e.ScrapeMetric(e.db, ch, metric, tick); err != nil {
         if !metric.IgnoreZeroResult {
             // do not print repetitive error messages for metrics
             // with ignoreZeroResult set to true
@@ -470,17 +473,21 @@ func (e *Exporter) reloadMetrics() {
 }

 // ScrapeMetric is an interface method to call scrapeGenericValues using Metric struct values
-func (e *Exporter) ScrapeMetric(db *sql.DB, ch chan<- prometheus.Metric, m Metric) error {
+func (e *Exporter) ScrapeMetric(db *sql.DB, ch chan<- prometheus.Metric, m Metric, tick *time.Time) error {
     level.Debug(e.logger).Log("msg", "Calling function ScrapeGenericValues()")
-    return e.scrapeGenericValues(db, ch, m.Context, m.Labels, m.MetricsDesc,
-        m.MetricsType, m.MetricsBuckets, m.FieldToAppend, m.IgnoreZeroResult,
-        m.Request, m.QueryTimeout)
+    if e.isScrapeMetric(tick, m) {
+        queryTimeout := e.getQueryTimeout(m)
+        return e.scrapeGenericValues(db, ch, m.Context, m.Labels, m.MetricsDesc,
+            m.MetricsType, m.MetricsBuckets, m.FieldToAppend, m.IgnoreZeroResult,
+            m.Request, queryTimeout)
+    }
+    return nil
 }

 // generic method for retrieving metrics.
 func (e *Exporter) scrapeGenericValues(db *sql.DB, ch chan<- prometheus.Metric, context string, labels []string,
     metricsDesc map[string]string, metricsType map[string]string, metricsBuckets map[string]map[string]string,
-    fieldToAppend string, ignoreZeroResult bool, request string, queryTimeout int) error {
+    fieldToAppend string, ignoreZeroResult bool, request string, queryTimeout time.Duration) error {
     metricsCount := 0
     genericParser := func(row map[string]string) error {
         // Construct labels value
@@ -586,13 +593,8 @@ func (e *Exporter) scrapeGenericValues(db *sql.DB, ch chan<- prometheus.Metric,

 // inspired by https://kylewbanks.com/blog/query-result-to-map-in-golang
 // Parse SQL result and call parsing function to each row
-func (e *Exporter) generatePrometheusMetrics(db *sql.DB, parse func(row map[string]string) error, query string, queryTimeout int) error {
-    timeout := e.config.QueryTimeout
-    if queryTimeout > 0 {
-        timeout = queryTimeout
-    }
-    timeoutDuration := time.Duration(timeout) * time.Second
-    ctx, cancel := context.WithTimeout(context.Background(), timeoutDuration)
+func (e *Exporter) generatePrometheusMetrics(db *sql.DB, parse func(row map[string]string) error, query string, queryTimeout time.Duration) error {
+    ctx, cancel := context.WithTimeout(context.Background(), queryTimeout)
     defer cancel()
     rows, err := db.QueryContext(ctx, query)

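RunScheduledScrapes follows the standard ticker-plus-context pattern, recording each tick so isScrapeMetric (in collector/metrics.go below) can measure elapsed time. A runnable, standalone sketch of the same loop shape, with illustrative names rather than the exporter's API:

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runScheduled mirrors the loop shape of RunScheduledScrapes above:
// act on each tick, remember the tick time, stop when ctx is done.
func runScheduled(ctx context.Context, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	var lastTick *time.Time
	for {
		select {
		case tick := <-ticker.C:
			if lastTick != nil {
				// This is the quantity isScrapeMetric compares a metric's
				// scrapeinterval against: time elapsed since the previous tick.
				fmt.Printf("elapsed since last tick: %v\n", tick.Sub(*lastTick))
			}
			lastTick = &tick
		case <-ctx.Done():
			return
		}
	}
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 350*time.Millisecond)
	defer cancel()
	runScheduled(ctx, 100*time.Millisecond)
}
```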
collector/metrics.go

Lines changed: 43 additions & 0 deletions
@@ -6,8 +6,51 @@ package collector
 import (
     "github.com/go-kit/log/level"
     "strconv"
+    "time"
 )

+// isScrapeMetric returns true if a metric should be scraped. Metrics may not be scraped if they have a custom scrape interval,
+// and the time since the last scrape is less than the custom scrape interval.
+// If there is no tick time or last known tick, the metric is always scraped.
+func (e *Exporter) isScrapeMetric(tick *time.Time, metric Metric) bool {
+    // Always scrape the metric if we don't have a current or last known tick.
+    if tick == nil || e.lastTick == nil {
+        return true
+    }
+    // If the metric doesn't have a custom scrape interval, scrape it.
+    interval, ok := e.getScrapeInterval(metric.Context, metric.ScrapeInterval)
+    if !ok {
+        return true
+    }
+    // If the metric's scrape interval is less than the time elapsed since the last scrape,
+    // we should scrape the metric.
+    return interval < tick.Sub(*e.lastTick)
+}
+
+func (e *Exporter) getScrapeInterval(context, scrapeInterval string) (time.Duration, bool) {
+    if len(scrapeInterval) > 0 {
+        si, err := time.ParseDuration(scrapeInterval)
+        if err != nil {
+            level.Error(e.logger).Log("msg", "Unable to convert scrapeinterval to duration (metric="+context+")")
+            return 0, false
+        }
+        return si, true
+    }
+    return 0, false
+}
+
+func (e *Exporter) getQueryTimeout(metric Metric) time.Duration {
+    if len(metric.QueryTimeout) > 0 {
+        qt, err := time.ParseDuration(metric.QueryTimeout)
+        if err != nil {
+            level.Error(e.logger).Log("msg", "Unable to convert querytimeout to duration (metric="+metric.Context+")")
+            return time.Duration(e.config.QueryTimeout) * time.Second
+        }
+        return qt
+    }
+    return time.Duration(e.config.QueryTimeout) * time.Second
+}
+
 func (e *Exporter) parseFloat(metric, metricHelp string, row map[string]string) (float64, bool) {
     value, ok := row[metric]
     if !ok {

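Both helpers lean on time.ParseDuration, so any Go duration string is accepted. A runnable sketch of the parsing behavior, using the example values the README cites:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// "300ms" and "0.5h" are the README's examples of valid querytimeout values.
	for _, s := range []string{"300ms", "0.5h", "5m", "not-a-duration"} {
		d, err := time.ParseDuration(s)
		if err != nil {
			// On a parse error, getQueryTimeout falls back to the global
			// query.timeout and getScrapeInterval reports no interval.
			fmt.Printf("%q: invalid duration\n", s)
			continue
		}
		fmt.Printf("%q parses to %v\n", s, d) // e.g. "0.5h" parses to 30m0s
	}
}
```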
main.go

Lines changed: 1 addition & 1 deletion
@@ -41,7 +41,7 @@ var (
     queryTimeout   = kingpin.Flag("query.timeout", "Query timeout (in seconds). (env: QUERY_TIMEOUT)").Default(getEnv("QUERY_TIMEOUT", "5")).Int()
     maxIdleConns   = kingpin.Flag("database.maxIdleConns", "Number of maximum idle connections in the connection pool. (env: DATABASE_MAXIDLECONNS)").Default(getEnv("DATABASE_MAXIDLECONNS", "0")).Int()
     maxOpenConns   = kingpin.Flag("database.maxOpenConns", "Number of maximum open connections in the connection pool. (env: DATABASE_MAXOPENCONNS)").Default(getEnv("DATABASE_MAXOPENCONNS", "10")).Int()
-    scrapeInterval = kingpin.Flag("scrape.interval", "Interval between each scrape. Default is to scrape on collect requests").Default("0s").Duration()
+    scrapeInterval = kingpin.Flag("scrape.interval", "Interval between each scrape. Default is to scrape on collect requests.").Default("0s").Duration()
     logDisable     = kingpin.Flag("log.disable", "Set to 1 to disable alert logs").Default("0").Int()
     logInterval    = kingpin.Flag("log.interval", "Interval between log updates (e.g. 5s).").Default("15s").Duration()
     logDestination = kingpin.Flag("log.destination", "File to output the alert log to. (env: LOG_DESTINATION)").Default(getEnv("LOG_DESTINATION", "/log/alert.log")).String()
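The wiring from this flag to the collector is not part of the diff; as a hypothetical sketch, the parsed value could reach RunScheduledScrapes (whose signature appears in collector/collector.go above) along these lines:

```go
// Hypothetical wiring, not taken from this commit; it relies on the
// scrapeInterval kingpin flag declared above.
package main

import (
	"context"

	"github.com/oracle/oracle-db-appdev-monitoring/collector" // assumed import path
)

func startScheduledScrapes(ctx context.Context, e *collector.Exporter) {
	if *scrapeInterval > 0 {
		// Scrape on a schedule instead of on each collect request.
		go e.RunScheduledScrapes(ctx, *scrapeInterval)
	}
	// With the 0s default, the exporter keeps scraping per collect request.
}
```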
