Building Resilient .NET 10 Applications with Polly
In today’s distributed and often unpredictable environments, building resilient applications is paramount. .NET 10, with its performance enhancements and modern features, provides a strong foundation. However, even the most robust .NET application can stumble when faced with transient faults, network hiccups, or overloaded services. That’s where Polly comes in. Polly is a .NET resilience and transient-fault-handling library that allows developers to gracefully handle these failures and keep their applications running smoothly. This comprehensive guide will explore how to leverage Polly to build truly resilient .NET 10 applications.
Why Resilience Matters in .NET 10 Applications
Before diving into the specifics of Polly, let’s understand why resilience is so critical in modern .NET development:
- Microservices Architecture: Many .NET applications are now built using a microservices architecture. This means applications are composed of numerous independent services, each potentially deployed on different servers and managed by different teams. The complexity of communication between these services increases the likelihood of failures.
- Cloud Environments: Deploying to the cloud (Azure, AWS, GCP) offers scalability and flexibility, but also introduces dependencies on network connectivity and external services, which can be unreliable.
- Transient Faults: Transient faults are temporary and often self-correcting errors that can occur in distributed systems. Examples include network glitches, temporary server overload, and database connection timeouts.
- Improved User Experience: Resilient applications provide a better user experience by masking failures, retrying operations, and gracefully degrading functionality when necessary. Users are less likely to encounter errors or experience application downtime.
- Reduced Operational Costs: By automatically handling transient faults, resilience strategies reduce the need for manual intervention and minimize the impact of failures on overall system performance.
Introducing Polly: Your .NET Resilience Toolkit
Polly is a .NET library that provides a fluent and thread-safe way to express resilience policies such as:
- Retry: Automatically retries an operation after a failure.
- Circuit Breaker: Prevents an application from repeatedly trying to execute an operation that is likely to fail, allowing the failing service to recover.
- Timeout: Aborts an operation if it exceeds a specified duration.
- Bulkhead Isolation: Limits the number of concurrent calls to a resource, preventing it from being overwhelmed.
- Cache: Caches the results of operations to reduce the load on downstream services.
- Fallback: Provides a default or alternative response when an operation fails.
Polly’s key features include:
- Fluent API: Polly provides a clean and intuitive fluent API for defining resilience policies.
- Asynchronous Support: Polly supports both synchronous and asynchronous operations.
- Extensibility: Polly is highly extensible, allowing you to create custom policies to meet specific requirements.
- Integration with .NET Ecosystem: Polly integrates seamlessly with other .NET libraries and frameworks, such as HttpClient and ASP.NET Core.
Setting Up Polly in Your .NET 10 Project
Adding Polly to your .NET 10 project is straightforward using NuGet Package Manager:
- Open NuGet Package Manager: In Visual Studio, go to Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution…
- Search for Polly: Search for the “Polly” package.
- Install the Package: Select the Polly package and click “Install” for your project.
Alternatively, you can use the .NET CLI:
dotnet add package Polly
Implementing Resilience Policies with Polly
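Let’s explore how to implement different resilience policies using Polly with concrete examples in .NET 10. The examples use Polly's classic `Policy` syntax (the v7-style API), which the Polly 8 package still ships alongside its newer `ResiliencePipeline` API.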
1. Retry Policy
The Retry policy allows you to automatically retry an operation a specified number of times after a failure. This is useful for handling transient faults such as network glitches.
Example: Retrying an HTTP request
using Polly;
using Polly.Retry;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class RetryExample
{
    private static readonly HttpClient _httpClient = new HttpClient();

    public static async Task Run()
    {
        AsyncRetryPolicy retryPolicy = Policy
            .Handle<HttpRequestException>() // Retry on HttpRequestException
            .WaitAndRetryAsync(
                retryCount: 3, // Retry up to 3 times
                sleepDurationProvider: attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)), // Exponential backoff
                onRetry: (exception, timespan, retryAttempt, context) =>
                {
                    Console.WriteLine($"Retry #{retryAttempt} due to: {exception.Message}. Waiting {timespan.TotalSeconds} seconds.");
                });

        try
        {
            string result = await retryPolicy.ExecuteAsync(async () =>
            {
                using HttpResponseMessage response = await _httpClient.GetAsync("https://example.com/api/data");

                // Simulate a transient error on roughly half of the attempts
                if (Random.Shared.Next(0, 2) == 0)
                {
                    throw new HttpRequestException("Simulated Transient Error");
                }

                response.EnsureSuccessStatusCode(); // Throws HttpRequestException for non-success status codes
                return await response.Content.ReadAsStringAsync();
            });

            Console.WriteLine($"Successfully retrieved data: {result}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Failed to retrieve data after multiple retries: {ex.Message}");
        }
    }
}

// To Run the Example
// await RetryExample.Run();
Explanation:
- `Policy.Handle<HttpRequestException>()`: Specifies that the retry policy should handle `HttpRequestException` exceptions.
- `WaitAndRetryAsync(retryCount, sleepDurationProvider, onRetry)`: Configures the retry policy with the following parameters:
  - `retryCount`: The number of times to retry the operation.
  - `sleepDurationProvider`: A function that determines the duration to wait between retries. In this example, we use an exponential backoff strategy.
  - `onRetry`: An action that is executed before each retry attempt. It provides information about the exception, the duration to wait, and the retry attempt number.
- `ExecuteAsync()`: Executes the operation with the specified retry policy.
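To make the backoff schedule concrete, the following sketch evaluates the same expression as the `sleepDurationProvider` above for each retry attempt. It uses no Polly types, and `BackoffDelay` is a hypothetical helper name introduced only for illustration.

```csharp
using System;

// Hypothetical helper mirroring the sleepDurationProvider lambda from the example:
// the delay doubles with each attempt (2^attempt seconds).
static TimeSpan BackoffDelay(int attempt) => TimeSpan.FromSeconds(Math.Pow(2, attempt));

// Polly passes 1 for the first retry, 2 for the second, and so on.
for (int attempt = 1; attempt <= 3; attempt++)
{
    Console.WriteLine($"Retry {attempt}: wait {BackoffDelay(attempt).TotalSeconds} s");
}
// Prints waits of 2 s, 4 s, and 8 s.
```

In the worst case the three retries add 14 seconds of waiting before the final exception surfaces, which is worth weighing against any timeout the caller applies.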
2. Circuit Breaker Policy
The Circuit Breaker policy prevents an application from repeatedly trying to execute an operation that is likely to fail. After a configured number of consecutive failures, the circuit breaker “opens,” and subsequent calls fail immediately without attempting the operation. After a specified duration, the circuit breaker enters a “half-open” state, allowing a single trial attempt. If that attempt succeeds, the circuit breaker closes; otherwise, it opens again for another break duration.
Example: Implementing a circuit breaker for database access
using Polly;
using Polly.CircuitBreaker;
using System;
using System.Threading.Tasks;

public class CircuitBreakerExample
{
    private static int _failureCount = 0;

    public static async Task Run()
    {
        AsyncCircuitBreakerPolicy circuitBreakerPolicy = Policy
            .Handle<Exception>() // Handle any exception
            .CircuitBreakerAsync(
                exceptionsAllowedBeforeBreaking: 3, // Break the circuit on the 3rd consecutive exception
                durationOfBreak: TimeSpan.FromSeconds(30), // Keep the circuit open for 30 seconds
                onBreak: (exception, timespan, context) =>
                {
                    Console.WriteLine($"Circuit broken due to: {exception.Message}. Circuit will remain open for {timespan}.");
                },
                onReset: (context) =>
                {
                    Console.WriteLine("Circuit reset.");
                },
                onHalfOpen: () =>
                {
                    Console.WriteLine("Circuit half-opened. Attempting a trial operation.");
                });

        for (int i = 0; i < 10; i++)
        {
            try
            {
                string result = await circuitBreakerPolicy.ExecuteAsync(async () =>
                {
                    // Simulate database access that fails on its first five executions
                    if (_failureCount < 5)
                    {
                        _failureCount++;
                        throw new Exception("Database connection error.");
                    }
                    return "Data retrieved from database.";
                });

                Console.WriteLine($"Operation successful: {result}");
            }
            catch (BrokenCircuitException)
            {
                Console.WriteLine("Circuit is open. Operation not attempted.");
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Operation failed: {ex.Message}");
            }

            await Task.Delay(TimeSpan.FromSeconds(5));
        }
    }
}

// To Run the Example
// await CircuitBreakerExample.Run();
Explanation:
- `CircuitBreakerAsync(exceptionsAllowedBeforeBreaking, durationOfBreak, onBreak, onReset, onHalfOpen)`: Configures the circuit breaker policy with the following parameters:
  - `exceptionsAllowedBeforeBreaking`: The number of consecutive handled exceptions after which the circuit breaker opens.
  - `durationOfBreak`: The duration for which the circuit breaker remains open.
  - `onBreak`: An action that is executed when the circuit breaker opens.
  - `onReset`: An action that is executed when the circuit breaker resets (closes).
  - `onHalfOpen`: An action that is executed when the circuit breaker enters the half-open state.
- `BrokenCircuitException`: This exception is thrown when you attempt to execute an operation while the circuit is open.
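The closed, open, and half-open transitions described above can be modelled in a few lines. This is a simplified sketch rather than Polly's implementation: it ignores concurrency and callbacks, and the names `StateAt`, `Record`, and `FailureThreshold` are hypothetical. Time is passed in explicitly so the transitions can be followed without waiting.

```csharp
using System;

const int FailureThreshold = 3;                      // consecutive failures before opening
TimeSpan breakDuration = TimeSpan.FromSeconds(30);   // how long the circuit stays open

int consecutiveFailures = 0;
DateTime? openedAt = null;                           // null while the circuit is closed

// Returns "Closed", "Open", or "HalfOpen" for a given point in time.
string StateAt(DateTime now) =>
    openedAt is null ? "Closed"
    : now - openedAt.Value < breakDuration ? "Open"
    : "HalfOpen";

// Records the outcome of one call made at time `now`.
void Record(bool success, DateTime now)
{
    if (success)
    {
        consecutiveFailures = 0;
        openedAt = null;                             // a success closes the circuit
        return;
    }

    consecutiveFailures++;
    // A failed trial call in the half-open state re-opens the circuit immediately;
    // otherwise it opens once the consecutive-failure threshold is reached.
    if (StateAt(now) == "HalfOpen" || consecutiveFailures >= FailureThreshold)
    {
        openedAt = now;
    }
}

DateTime t0 = new DateTime(2025, 1, 1, 0, 0, 0, DateTimeKind.Utc);
Record(false, t0);
Record(false, t0);
Record(false, t0);                                   // third consecutive failure
Console.WriteLine(StateAt(t0.AddSeconds(5)));        // Open
Console.WriteLine(StateAt(t0.AddSeconds(31)));       // HalfOpen
Record(true, t0.AddSeconds(31));                     // trial call succeeds
Console.WriteLine(StateAt(t0.AddSeconds(32)));       // Closed
```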
3. Timeout Policy
The Timeout policy aborts an operation if it exceeds a specified duration. This prevents long-running operations from blocking resources and degrading application performance.
Example: Setting a timeout for an API call
using Polly;
using Polly.Timeout;
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public class TimeoutExample
{
    private static readonly HttpClient _httpClient = new HttpClient();

    public static async Task Run()
    {
        AsyncTimeoutPolicy timeoutPolicy = Policy
            .TimeoutAsync(
                TimeSpan.FromSeconds(10), // Timeout after 10 seconds
                TimeoutStrategy.Optimistic, // Optimistic timeout (relies on the delegate honoring the CancellationToken)
                onTimeoutAsync: (context, timespan, task) =>
                {
                    Console.WriteLine($"Timeout occurred after {timespan} for operation: {context.OperationKey}.");
                    return Task.CompletedTask;
                });

        try
        {
            string result = await timeoutPolicy.ExecuteAsync(async (ct) =>
            {
                // Simulate a long-running API call
                await Task.Delay(TimeSpan.FromSeconds(15), ct); // Simulate 15 second delay
                using HttpResponseMessage response = await _httpClient.GetAsync("https://example.com/api/data", ct);
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync(ct);
            }, CancellationToken.None); // Polly supplies the delegate with a token that is cancelled on timeout

            Console.WriteLine($"Successfully retrieved data: {result}");
        }
        catch (TimeoutRejectedException)
        {
            Console.WriteLine("Operation timed out.");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Operation failed: {ex.Message}");
        }
    }
}

// To Run the Example
// await TimeoutExample.Run();
Explanation:
- `TimeoutAsync(TimeSpan, TimeoutStrategy, onTimeoutAsync)`: Configures the timeout policy with the following parameters:
  - `TimeSpan`: The timeout duration.
  - `TimeoutStrategy`: The strategy to use when a timeout occurs. `Optimistic` relies on co-operative cancellation: Polly cancels the `CancellationToken` it passes to the delegate, and the delegate is expected to honor it. `Pessimistic` returns control to the caller when the timeout expires even if the delegate does not observe cancellation.
  - `onTimeoutAsync`: An action that is executed when a timeout occurs.
- `TimeoutRejectedException`: This exception is thrown when the timeout expires, under either strategy.
- It's crucial to pass the `CancellationToken` on to asynchronous operations so that they can be cancelled when a timeout occurs.
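The co-operative cancellation that the optimistic strategy depends on can be seen without Polly. In this minimal sketch, `SlowOperationAsync` is a hypothetical stand-in for a slow downstream call, and a `CancellationTokenSource` plays the role of the timeout; Polly additionally translates the resulting cancellation into a `TimeoutRejectedException`.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a slow downstream call that honors cancellation.
static async Task<string> SlowOperationAsync(CancellationToken ct)
{
    await Task.Delay(TimeSpan.FromSeconds(5), ct);   // throws if ct is cancelled first
    return "done";
}

// Cancel the token after 200 ms, similar in spirit to an optimistic timeout.
using var cts = new CancellationTokenSource(TimeSpan.FromMilliseconds(200));
string outcome;
try
{
    outcome = await SlowOperationAsync(cts.Token);
}
catch (OperationCanceledException)
{
    outcome = "timed out";
}

Console.WriteLine(outcome);   // timed out
```

If `SlowOperationAsync` ignored the token, the call would run for the full five seconds, which is the situation the pessimistic strategy exists for.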
4. Bulkhead Isolation Policy
The Bulkhead Isolation policy limits the number of concurrent calls to a resource, preventing it from being overwhelmed. This can be useful for protecting against cascading failures in microservices architectures.
Example: Limiting concurrent database connections
using Polly;
using Polly.Bulkhead;
using System;
using System.Threading.Tasks;

public class BulkheadExample
{
    private static readonly AsyncBulkheadPolicy _bulkheadPolicy = Policy
        .BulkheadAsync(
            maxParallelization: 3, // Allow a maximum of 3 concurrent calls
            maxQueuingActions: 2, // Allow a maximum of 2 queued calls
            onBulkheadRejectedAsync: (context) =>
            {
                Console.WriteLine($"Bulkhead rejected execution for operation: {context.OperationKey}.");
                return Task.CompletedTask;
            });

    public static async Task Run()
    {
        var tasks = new Task[6]; // Create 6 tasks

        for (int i = 0; i < tasks.Length; i++)
        {
            int taskNumber = i + 1;
            tasks[i] = Task.Run(async () =>
            {
                try
                {
                    Console.WriteLine($"Task {taskNumber}: Entering Bulkhead.");
                    string result = await _bulkheadPolicy.ExecuteAsync(async () =>
                    {
                        Console.WriteLine($"Task {taskNumber}: Executing operation.");
                        await Task.Delay(TimeSpan.FromSeconds(2)); // Simulate database operation
                        return $"Task {taskNumber}: Data retrieved from database.";
                    });
                    Console.WriteLine($"Task {taskNumber}: Operation successful: {result}");
                }
                catch (BulkheadRejectedException)
                {
                    Console.WriteLine($"Task {taskNumber}: Bulkhead rejected execution.");
                }
                catch (Exception ex)
                {
                    Console.WriteLine($"Task {taskNumber}: Operation failed: {ex.Message}");
                }
                finally
                {
                    Console.WriteLine($"Task {taskNumber}: Leaving Bulkhead.");
                }
            });
        }

        await Task.WhenAll(tasks);
    }
}

// To Run the Example
// await BulkheadExample.Run();
Explanation:
- `BulkheadAsync(maxParallelization, maxQueuingActions, onBulkheadRejectedAsync)`: Configures the bulkhead policy with the following parameters:
  - `maxParallelization`: The maximum number of concurrent calls allowed.
  - `maxQueuingActions`: The maximum number of calls that can be queued while the bulkhead is full. Calls beyond this limit will be rejected.
  - `onBulkheadRejectedAsync`: An asynchronous action that is executed when a call is rejected.
- `BulkheadRejectedException`: This exception is thrown when the bulkhead rejects a call.
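The core idea behind a bulkhead, a fixed number of execution slots with immediate rejection once they are taken, can be sketched with a `SemaphoreSlim`. This is an illustration under simplifying assumptions rather than Polly's implementation: there is no queue (the equivalent of `maxQueuingActions: 0`), and `TryRunAsync` is a hypothetical helper.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// At most 3 callers may run the protected section at once (no queue in this sketch).
var slots = new SemaphoreSlim(3);
int rejected = 0;

async Task<bool> TryRunAsync(int id)
{
    // Take a slot if one is free right now; otherwise reject immediately.
    if (!await slots.WaitAsync(0))
    {
        Interlocked.Increment(ref rejected);
        return false;
    }

    try
    {
        await Task.Delay(200);   // simulated work for caller `id`
        return true;
    }
    finally
    {
        slots.Release();         // free the slot for the next caller
    }
}

// Start five calls back to back; the first three occupy all the slots.
var calls = new Task<bool>[5];
for (int i = 0; i < calls.Length; i++)
{
    calls[i] = TryRunAsync(i + 1);
}

bool[] results = await Task.WhenAll(calls);
int accepted = 0;
foreach (bool ok in results)
{
    if (ok) accepted++;
}

Console.WriteLine($"accepted: {accepted}, rejected: {rejected}");   // accepted: 3, rejected: 2
```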
5. Cache Policy
The Cache policy caches the results of operations to reduce the load on downstream services. This is particularly useful for operations that are expensive to execute or that return data that does not change frequently.
Example: Caching API responses
using Polly;
using Polly.Caching;
using Polly.Caching.Memory; // From the separate Polly.Caching.Memory NuGet package
using Microsoft.Extensions.Caching.Memory;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class CacheExample
{
    private static readonly HttpClient _httpClient = new HttpClient();
    private static readonly IMemoryCache _memoryCache = new MemoryCache(new MemoryCacheOptions());

    private static readonly AsyncCachePolicy<string> _cachePolicy = Policy.CacheAsync<string>(
        new MemoryCacheProvider(_memoryCache),
        TimeSpan.FromSeconds(30), // Cache duration: 30 seconds
        (context, key) => Console.WriteLine($"Cache Hit: Operation {key} retrieved from cache."), // onCacheGet
        (context, key) => Console.WriteLine($"Cache Miss: Operation {key} not found in cache."), // onCacheMiss
        (context, key) => Console.WriteLine($"Cache Put: Operation {key} stored in cache."), // onCachePut
        (context, key, exception) => Console.WriteLine($"Cache Get Error for {key}: {exception.Message}"), // onCacheGetError
        (context, key, exception) => Console.WriteLine($"Cache Put Error for {key}: {exception.Message}") // onCachePutError
    );

    public static async Task Run()
    {
        for (int i = 0; i < 3; i++)
        {
            try
            {
                string result = await _cachePolicy.ExecuteAsync(async (context) =>
                {
                    Console.WriteLine("Executing operation...");
                    // Simulate an API call
                    await Task.Delay(TimeSpan.FromSeconds(5)); // Simulate 5 second delay
                    using HttpResponseMessage response = await _httpClient.GetAsync("https://example.com/api/data");
                    response.EnsureSuccessStatusCode();
                    return await response.Content.ReadAsStringAsync();
                }, new Context("GetData"));

                Console.WriteLine($"Successfully retrieved data: {result}");
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Operation failed: {ex.Message}");
            }

            await Task.Delay(TimeSpan.FromSeconds(10)); // Delay between calls to observe caching
        }
    }
}

// To Run the Example
// await CacheExample.Run();
Explanation:
- We are using `Microsoft.Extensions.Caching.Memory` to provide an in-memory cache store, exposed to Polly through `MemoryCacheProvider` from the separate `Polly.Caching.Memory` NuGet package. Alternatives include Redis cache providers for distributed caching.
- `Policy.CacheAsync<string>(cacheProvider, ttl, onCacheGet, onCacheMiss, onCachePut, onCacheGetError, onCachePutError)`:
  - `new MemoryCacheProvider(_memoryCache)`: Uses the in-memory cache as a caching provider for Polly.
  - `TimeSpan.FromSeconds(30)`: Sets the cache duration to 30 seconds.
  - `onCacheGet`, `onCacheMiss`, `onCachePut`: Actions to execute on cache hits, misses, and puts, respectively. These can be useful for logging or monitoring cache behavior.
  - `onCacheGetError`, `onCachePutError`: Actions to execute when reading from or writing to the cache throws an exception.
- `Context("GetData")`: Associates a key "GetData" with the cached operation. This key is used to identify the cached data.
6. Fallback Policy
The Fallback policy provides a default or alternative response when an operation fails. This can be used to gracefully degrade functionality or provide a more user-friendly error message.
Example: Providing a default response when an API call fails
using Polly;
using Polly.Fallback;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class FallbackExample
{
    private static readonly HttpClient _httpClient = new HttpClient();

    public static async Task Run()
    {
        AsyncFallbackPolicy<string> fallbackPolicy = Policy<string>
            .Handle<HttpRequestException>() // Handle HttpRequestExceptions
            .FallbackAsync(
                fallbackValue: "Default data", // Return "Default data" on failure
                onFallbackAsync: (outcome, context) =>
                {
                    Console.WriteLine($"Fallback executed due to: {outcome.Exception?.Message ?? outcome.Result}.");
                    return Task.CompletedTask;
                });

        try
        {
            string result = await fallbackPolicy.ExecuteAsync(async () =>
            {
                // Simulate an API call that might fail
                using HttpResponseMessage response = await _httpClient.GetAsync("https://example.com/api/data");
                response.EnsureSuccessStatusCode(); // Throws HttpRequestException for non-success status codes
                return await response.Content.ReadAsStringAsync();
            });

            Console.WriteLine($"Successfully retrieved data: {result}");
        }
        catch (Exception ex)
        {
            // Reached only for exceptions the fallback policy does not handle
            Console.WriteLine($"Operation failed: {ex.Message}");
        }
    }
}

// To Run the Example
// await FallbackExample.Run();
Explanation:
- `Policy<string>.Handle<HttpRequestException>()`: Specifies that the fallback policy should handle `HttpRequestException` exceptions and that the return type is a string.
- `FallbackAsync(fallbackValue, onFallbackAsync)`: Configures the fallback policy with the following parameters:
  - `fallbackValue`: The default value to return when the operation fails.
  - `onFallbackAsync`: An action that is executed when the fallback policy is triggered. It provides access to the original outcome (the result or exception) and the `Context`.
Composing Policies for Greater Resilience
Polly's real power comes from its ability to compose multiple policies together. This allows you to create sophisticated resilience strategies that address a variety of potential failures.
Example: Combining Retry and Circuit Breaker Policies
using Polly;
using Polly.Retry;
using Polly.CircuitBreaker;
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class CombinedPolicyExample
{
    private static readonly HttpClient _httpClient = new HttpClient();

    public static async Task Run()
    {
        AsyncRetryPolicy retryPolicy = Policy
            .Handle<HttpRequestException>()
            .WaitAndRetryAsync(3, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

        AsyncCircuitBreakerPolicy circuitBreakerPolicy = Policy
            .Handle<HttpRequestException>()
            .CircuitBreakerAsync(2, TimeSpan.FromSeconds(10));

        // Combine the Retry and Circuit Breaker policies (retry is the outer policy)
        IAsyncPolicy combinedPolicy = Policy.WrapAsync(retryPolicy, circuitBreakerPolicy);

        try
        {
            string result = await combinedPolicy.ExecuteAsync(async () =>
            {
                using HttpResponseMessage response = await _httpClient.GetAsync("https://example.com/api/data");
                response.EnsureSuccessStatusCode();
                return await response.Content.ReadAsStringAsync();
            });

            Console.WriteLine($"Successfully retrieved data: {result}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Operation failed: {ex.Message}");
        }
    }
}

// To Run the Example
// await CombinedPolicyExample.Run();
Explanation:
- `Policy.WrapAsync(retryPolicy, circuitBreakerPolicy)`: Combines the retry and circuit breaker policies. The first argument is the outermost policy, so the `retryPolicy` wraps the `circuitBreakerPolicy`, and every attempt (including each retry) passes through the circuit breaker. If the operation keeps failing and the circuit breaker opens, the next attempt fails immediately with a `BrokenCircuitException`. Because this retry policy handles only `HttpRequestException`, that exception is not retried and propagates to the caller.
- Policy wrapping can be chained to combine multiple policies. The order of wrapping matters, as it affects the execution order.
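The effect of wrapping order can be illustrated with plain delegates. The sketch below does not use Polly; `Layer` is a hypothetical helper that logs when a wrapper is entered and exited, and the nesting has the same shape as `WrapAsync(retryPolicy, circuitBreakerPolicy)`, with the first layer outermost.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical helper: returns a wrapper that logs entry and exit
// around whatever delegate it wraps.
Func<Func<string>, Func<string>> Layer(string name) => inner => () =>
{
    log.Add($"enter {name}");
    string result = inner();
    log.Add($"exit {name}");
    return result;
};

Func<string> operation = () =>
{
    log.Add("operation");
    return "ok";
};

// Same shape as WrapAsync(retry, circuitBreaker): the first layer is outermost.
Func<string> wrapped = Layer("retry")(Layer("circuitBreaker")(operation));
wrapped();

Console.WriteLine(string.Join(" -> ", log));
// enter retry -> enter circuitBreaker -> operation -> exit circuitBreaker -> exit retry
```

Swapping the two layers would place the circuit breaker outside the retry, so it would see one outcome per call instead of one per attempt.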
Best Practices for Using Polly in .NET 10 Applications
To effectively leverage Polly and build resilient .NET 10 applications, consider the following best practices:
- Identify Critical Operations: Focus on protecting the most critical operations in your application. Prioritize operations that are essential for user experience and business functionality.
- Choose Appropriate Policies: Select the appropriate resilience policies based on the specific types of failures you are trying to handle. Retry policies are suitable for transient faults, while circuit breaker policies are useful for preventing cascading failures.
- Configure Policies Carefully: Carefully configure the parameters of your resilience policies, such as retry counts, timeout durations, and circuit breaker thresholds. Incorrectly configured policies can be ineffective or even detrimental. Tune these parameters based on the characteristics of the services you are interacting with.
- Use Logging and Monitoring: Implement logging and monitoring to track the behavior of your resilience policies. This will help you understand how your application is responding to failures and identify areas for improvement. Polly exposes callbacks such as `onRetry`, `onBreak`, `onReset`, and `onTimeoutAsync` that can be used for logging and monitoring.
- Test Your Resilience Strategies: Thoroughly test your resilience strategies to ensure that they are working as expected. Simulate different types of failures and observe how your application responds. Consider chaos engineering practices that inject faults into your system to validate your resilience measures.
- Avoid Over-Wrapping: While composing policies is powerful, avoid over-wrapping policies. Too many layers of policies can make your code complex and difficult to understand. Keep it simple and focused on the specific resilience requirements of each operation.
- Use Asynchronous Operations: Polly works best with asynchronous operations. Use the `async` and `await` keywords to perform non-blocking operations and avoid blocking threads.
- Consider Using Polly Registry: For larger applications with many policies, consider using the `Polly.Registry` to manage and reuse policies. This can help to reduce code duplication and improve maintainability.
- Implement Health Checks: Integrate health checks into your .NET 10 application to monitor the health of its dependencies. Use the health check results to inform your resilience strategies. For example, you could use a circuit breaker to prevent calls to a failing service that is reporting an unhealthy status.
Conclusion
Building resilient .NET 10 applications is essential in today's complex and distributed environments. Polly provides a powerful and flexible toolkit for handling transient faults and implementing resilience strategies. By understanding the different types of policies, configuring them appropriately, and following best practices, you can build applications that are more robust, reliable, and user-friendly. Embrace Polly and start building more resilient .NET 10 applications today!