SkillAgentSearch skills...

Resile

Resile is an ergonomic, type-safe execution resilience and retry library for Go. Inspired by Python’s stamina, it features generic execution wrappers, AWS Full Jitter backoff, native Retry-After header support, and zero-dependency observability for distributed systems.

Install / Use

/learn @cinar/Resile

README

Resile: Ergonomic Execution Resilience for Go

Go Reference License Build Status Codecov Go Report Card

Resile is a production-grade execution resilience and retry library for Go, inspired by Python's stamina. It provides a type-safe, ergonomic, and highly observable way to handle transient failures in distributed systems.


Table of Contents


Installation

go get github.com/cinar/resile

Why Resile?

In distributed systems, transient failures are a mathematical certainty. Resile simplifies the "Correct Way" to retry:

  • AWS Full Jitter: Uses the industry-standard algorithm to prevent "thundering herd" synchronization.
  • Adaptive Retries: Built-in token bucket rate limiting to prevent "retry storms" across a cluster.
  • Generic-First: No interface{} or reflection. Full compile-time type safety.
  • Context-Aware: Strictly respects context.Context cancellation and deadlines.
  • Zero-Dependency Core: The core library only depends on the Go standard library.
  • Opinionated Defaults: Sensible production-ready defaults (5 attempts, exponential backoff).
  • Chaos-Ready: Built-in support for fault and latency injection to test your resilience policies.

Articles & Tutorials

Want to learn more about the philosophy behind Resile and advanced resilience patterns in Go? Check out these deep dives:

Examples

The examples/ directory contains standalone programs showing how to use Resile in various scenarios:


Common Use Cases

1. Simple Retries

Retry a simple operation that only returns an error. If all retries fail, Resile returns an aggregated error containing the failures from every attempt.

err := resile.DoErr(ctx, func(ctx context.Context) error {
    return db.PingContext(ctx)
})

2. Value-Yielding Retries (Generics)

Fetch data with full type safety. The return type is inferred from your closure.

// val is automatically inferred as *User
user, err := resile.Do(ctx, func(ctx context.Context) (*User, error) {
    return apiClient.GetUser(ctx, userID)
}, resile.WithMaxAttempts(3))

3. Request Hedging (Speculative Retries)

Speculative retries reduce tail latency by starting a second request if the first one doesn't finish within a configured HedgingDelay. The first successful result is used, and the other is cancelled.

// For value-yielding operations
data, err := resile.DoHedged(ctx, action, 
    resile.WithMaxAttempts(3),
    resile.WithHedgingDelay(100 * time.Millisecond),
)

// For error-only operations
err := resile.DoErrHedged(ctx, action,
    resile.WithMaxAttempts(2),
    resile.WithHedgingDelay(50 * time.Millisecond),
)

4. Stateful Retries & Endpoint Rotation

Use DoState (or DoErrState) to access the RetryState, allowing you to rotate endpoints or fallback logic based on the failure history.

endpoints := []string{"api-v1.example.com", "api-v2.example.com"}

data, err := resile.DoState(ctx, func(ctx context.Context, state resile.RetryState) (string, error) {
    // Rotate endpoint based on attempt number
    url := endpoints[state.Attempt % uint(len(endpoints))]
    return client.Get(ctx, url)
})

5. Handling Rate Limits (Retry-After)

Resile automatically detects if an error implements RetryAfterError. It can override the jittered backoff with a server-dictated duration and can also signal immediate termination (pushback).

type RateLimitError struct {
    WaitUntil time.Time
}

func (e *RateLimitError) Error() string { return "too many requests" }
func (e *RateLimitError) RetryAfter() time.Duration {
    return time.Until(e.WaitUntil)
}
func (e *RateLimitError) CancelAllRetries() bool {
    // Return true to abort the entire retry loop immediately.
    return false 
}

// Resile will sleep exactly until WaitUntil when this error is encountered.

6. Aborting Retries (Pushback Signal)

If a downstream service returns a terminal error (like "Quota Exceeded") that shouldn't be retried, implement CancelAllRetries() bool to abort the entire retry loop immediately.

type QuotaExceededError struct{}
func (e *QuotaExceededError) Error() string { return "quota exhausted" }
func (e *QuotaExceededError) CancelAllRetries() bool { return true }

// Resile will stop immediately if this error is encountered,
// even if more attempts are remaining.
_, err := resile.Do(ctx, action, resile.WithMaxAttempts(10))

7. Fallback Strategies

Provide a fallback function to handle cases where all retries are exhausted or the circuit breaker is open. This is useful for returning stale data or default values.

data, err := resile.Do(ctx, fetchData,
    resile.WithMaxAttempts(3),
    resile.WithFallback(func(ctx context.Context, err error) (string, error) {
        // Return stale data from cache if the primary fetch fails
        return cache.Get(ctx, key), nil 
    }),
)

8. Bulkhead Pattern

Isolate failures by limiting the number of concurrent executions to a specific resource.

// Shared bulkhead with capacity of 10
bh := resile.NewBulkhead(10)

err := resile.DoErr(ctx,
View on GitHub
GitHub Stars36
CategoryCustomer
Updated1d ago
Forks1

Languages

Go

Security Score

95/100

Audited on Apr 4, 2026

No findings