
Implementing Robust Rate Limiting in Your .NET APIs


Rate limiting is a crucial aspect of API development, essential for maintaining performance, ensuring fairness, and bolstering security. It acts as a gatekeeper, controlling the number of requests a client can make within a specific timeframe. This post dives deep into understanding and implementing various rate-limiting strategies within your .NET applications, ensuring your APIs remain stable and resilient.

💡
If you prefer a video guide, you may use the following lesson for reference.

Understanding the Fundamentals of Rate Limiting

At its core, rate limiting is about setting boundaries. It prevents any single client from overwhelming your system with too many requests. This is vital for several reasons:

  • Performance Preservation: High traffic spikes can bog down your servers, leading to slow response times or even complete outages. Rate limiting helps distribute the load more evenly.
  • Abuse Prevention: Malicious actors might attempt to exploit your API through brute-force attacks or denial-of-service (DoS) attempts. Rate limiting acts as a first line of defense.
  • Fair Usage: It ensures that all legitimate users have a reasonable chance to access your API without being drowned out by a few aggressive clients.

Common rate-limiting rules might include:

  • Limiting a specific user to a certain number of requests per minute (e.g., 100 requests/minute/user).
  • Blocking any requests that exceed a predefined threshold, regardless of the source.

Let's explore how to implement these concepts in a .NET environment.

Setting Up Rate Limiting in Your .NET Application

The journey to implementing rate limiting in a .NET application typically begins in the Program.cs file, where we configure services and middleware.

Registering the Rate Limiter Service

First, we need to register the rate-limiting services with the dependency injection container. This is a straightforward step:

builder.Services.AddRateLimiter();

While this line is simple, it requires further configuration to define the actual rate-limiting policies. We achieve this by using a delegate to configure the options.
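For context, the registration and its options delegate together look roughly like this; the individual policy calls shown throughout this post all go inside the delegate:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    // Individual policies (fixed window, sliding window, token bucket,
    // concurrency) are registered here via the options object
});
```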

Configuring Rate Limiting Options

The AddRateLimiter method accepts an options delegate that allows us to define various policies. These policies dictate how rate limiting is applied. Some of the available strategies include:

  • Fixed Window Limiter: A straightforward approach where requests are counted within fixed time intervals.
  • Sliding Window Limiter: A more dynamic method that tracks requests over a continuously moving time window.
  • Token Bucket Limiter: A model where "tokens" are added to a bucket at a constant rate. Requests consume tokens, and if the bucket is empty, requests are rejected.
  • Concurrency Limiter: This limits the number of concurrent requests that can be processed.

Let's start by implementing a Fixed Window Limiter.

Implementing a Fixed Window Limiter

To set up a fixed window policy, we give it a name and configure its options.

options.AddFixedWindowLimiter(policyName: "fixed", opt =>
{
    // Define the duration of the window
    opt.Window = TimeSpan.FromMinutes(1);
    // Set the maximum number of requests allowed within the window
    opt.PermitLimit = 100;
    // Configure queueing behavior when the limit is reached
    opt.QueueProcessingOrder = System.Threading.RateLimiting.QueueProcessingOrder.OldestFirst;
    // Define the maximum number of requests to queue
    opt.QueueLimit = 5;
});

In this example:

  • policyName: "fixed" is the identifier for this policy.
  • Window: Set to TimeSpan.FromMinutes(1), meaning the limit applies to requests made within a one-minute window.
  • PermitLimit: Set to 100, allowing a maximum of 100 requests per minute.
  • QueueProcessingOrder: Configures how queued requests are handled. OldestFirst is a common choice.
  • QueueLimit: Set to 5, meaning up to 5 requests can be queued if the PermitLimit is reached.

For testing purposes, we might reduce the PermitLimit to a smaller value, such as 5, to observe rate limiting more quickly.

Applying the Rate Limiter Middleware

After configuring the rate limiter service, we need to add its middleware to the application's request pipeline. The placement of this middleware is critical. It should generally be placed after middleware that might perform redirects, but before middleware that handles core application logic; when you use endpoint-specific policies, it must also come after UseRouting.

app.UseRateLimiter();

This ensures that the rate-limiting rules are checked early in the request processing flow.
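A minimal pipeline might be ordered like this; the middleware calls other than UseRateLimiter are the usual ASP.NET Core defaults:

```csharp
var app = builder.Build();

app.UseHttpsRedirection();  // redirects happen before limits are counted
app.UseRouting();           // required before UseRateLimiter for endpoint-specific policies
app.UseRateLimiter();       // enforce rate limits early in the pipeline
app.UseAuthorization();
app.MapControllers();       // endpoints opt in via [EnableRateLimiting]

app.Run();
```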

Attaching Rate Limiting Policies to Controllers/Actions

Once the service and middleware are set up, you can apply specific rate-limiting policies to your controllers or individual actions using attributes.

For our "fixed" policy, we can enable it on a controller or action like this:

[EnableRateLimiting("fixed")]
public class MyController : ControllerBase
{
    // ... controller actions
}

Or on a specific action:

[HttpGet]
[EnableRateLimiting("fixed")]
public IActionResult GetData()
{
    // ... action logic
}

Testing the Fixed Window Limiter

When you test this setup, requests that exceed the defined PermitLimit within the Window are first queued (up to the QueueLimit) and then rejected. Out of the box, rejected requests receive a 503 Service Unavailable response, which reads like a server failure rather than a client-side limit. To provide clearer feedback to the client, we can configure the rejection status code.

Customizing Rejected Status Codes

Instead of a generic timeout, it's a better practice to return a standard HTTP status code indicating that the rate limit has been exceeded. The 429 Too Many Requests status code is the industry standard for this.

We can configure this within the AddRateLimiter options:

options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

With this change, when the rate limit is hit, the client will receive a 429 response. Note that the middleware does not add a Retry-After header automatically; you can emit one yourself from the limiter's OnRejected callback so clients know how long to wait before retrying. For quicker testing, you might temporarily reduce the Window or PermitLimit to observe this behavior.
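One way to surface a Retry-After header is the OnRejected callback on the rate limiter options; the lease metadata, when available, tells us how long the client should wait:

```csharp
options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

options.OnRejected = (context, cancellationToken) =>
{
    // If the limiter knows when capacity frees up, surface it to the client
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        context.HttpContext.Response.Headers.RetryAfter =
            ((int)retryAfter.TotalSeconds).ToString();
    }
    return ValueTask.CompletedTask;
};
```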

Implementing Per-User Rate Limiting with Sliding Window

A more sophisticated approach is to implement rate limiting on a per-user basis. This is particularly useful for authenticated users, ensuring that each individual user adheres to the defined limits. The Sliding Window Limiter is well suited to this.

Configuring a Per-User Sliding Window Policy

To implement per-user rate limiting, we extract the user's identity from the HTTP context and use it as the limiter's partition key. First, let's look at the sliding window settings themselves:

options.AddSlidingWindowLimiter(policyName: "perUser", opt =>
{
    opt.Window = TimeSpan.FromMinutes(1); // 1 minute window
    opt.PermitLimit = 50; // 50 requests per user per minute
    opt.SegmentsPerWindow = 6; // 6 segments, each 10 seconds long (60s / 6 = 10s)
    opt.QueueProcessingOrder = System.Threading.RateLimiting.QueueProcessingOrder.OldestFirst;
    opt.QueueLimit = 3; // 3 queued requests per user
});

Here's a breakdown:

  • policyName: "perUser" identifies this policy.
  • Window: Set to TimeSpan.FromMinutes(1).
  • PermitLimit: Set to 50, meaning a user can make up to 50 requests per minute.
  • SegmentsPerWindow: This divides the main window into smaller segments. With 6 segments for a 60-second window, each segment is 10 seconds long. This allows for more granular tracking within the sliding window.
  • QueueLimit: Set to 3, allowing a small buffer of queued requests.

However, AddSlidingWindowLimiter on its own creates a single counter shared by every client. To make the limit truly per-user, replace that registration with AddPolicy, which returns a RateLimitPartition keyed by a value we extract from the HttpContext — here, the user's ID:

options.AddPolicy("perUser", httpContext =>
{
    var userId = httpContext.User.FindFirst(System.Security.Claims.ClaimTypes.NameIdentifier)?.Value;
    if (string.IsNullOrEmpty(userId))
    {
        // Anonymous users share one stricter partition
        return RateLimitPartition.GetFixedWindowLimiter("anonymous", _ => new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 10
        });
    }

    // Each user ID gets its own sliding window counter
    return RateLimitPartition.GetSlidingWindowLimiter(userId, _ => new SlidingWindowRateLimiterOptions
    {
        Window = TimeSpan.FromMinutes(1),
        PermitLimit = 50,
        SegmentsPerWindow = 6,
        QueueLimit = 3,
        QueueProcessingOrder = System.Threading.RateLimiting.QueueProcessingOrder.OldestFirst
    });
});

This configuration allows you to define limits that are specific to each authenticated user. For example, you might apply this policy to sensitive endpoints like order processing or account management.
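Attaching the per-user policy works the same way as before; the OrdersController name below is just a placeholder for one of your own controllers:

```csharp
[EnableRateLimiting("perUser")]
[ApiController]
[Route("api/[controller]")]
public class OrdersController : ControllerBase
{
    // Every action here is limited per authenticated user
}
```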

Implementing Global Rate Limiting by IP Address

Beyond per-user limits, it's essential to have a global rate-limiting strategy to protect your API from widespread abuse, such as bot attacks or denial-of-service attempts. Limiting by IP address is a common and effective method for this.

Creating a Global IP-Based Rate Limiter

We can create a global policy that uses the client's IP address as the partition key. This policy can be applied broadly across your API.

options.AddPolicy("globalIp", httpContext =>
{
    var ipAddress = httpContext.Connection.RemoteIpAddress?.ToString();
    if (string.IsNullOrEmpty(ipAddress))
    {
        ipAddress = "unknown"; // Fallback when no remote IP is available
    }

    // Each unique IP address gets its own fixed window counter
    return RateLimitPartition.GetFixedWindowLimiter(ipAddress, _ => new FixedWindowRateLimiterOptions
    {
        Window = TimeSpan.FromMinutes(1), // 1 minute window
        PermitLimit = 200,                // 200 requests per IP per minute
        QueueLimit = 5,                   // 5 queued requests per IP
        QueueProcessingOrder = System.Threading.RateLimiting.QueueProcessingOrder.OldestFirst
    });
});

In this global policy:

  • We extract the RemoteIpAddress from the HttpContext.
  • The partition key passed to RateLimitPartition.GetFixedWindowLimiter is the IP address, so each unique IP address has its own rate-limiting counter.
  • PermitLimit is set to 200, allowing 200 requests per minute per IP address. Note: These numbers are illustrative. You should choose values appropriate for your API's expected traffic and resource constraints. For instance, 200 requests per minute from a single IP might still be too high and could be reduced to 50 or 10, depending on your needs.

This global policy can be applied to all endpoints or specific groups of endpoints that are particularly vulnerable.

Applying Global Policies

Unlike per-user or endpoint-specific policies, a truly global limit is configured through the GlobalLimiter property on the rate limiter options, inside AddRateLimiter. It runs for every request, before any named policy is evaluated:

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(
        httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        _ => new FixedWindowRateLimiterOptions
        {
            Window = TimeSpan.FromMinutes(1),
            PermitLimit = 200
        }));

Alternatively, you can attach it using the [EnableRateLimiting] attribute to controllers or actions if you want more explicit control over where it's applied.

Other Rate Limiting Strategies

While Fixed Window and Sliding Window limiters are common, .NET's rate limiting library also supports:

  • Token Bucket Limiter: This model is excellent for allowing bursts of traffic. A bucket is filled with tokens at a steady rate. Each request consumes a token. If the bucket is empty, requests are rejected. This allows for periods of high activity followed by periods of lower activity, all within a defined average rate.
  • Concurrency Limiter: This directly limits the number of requests that can be actively processed at any given time, preventing your application from being overwhelmed by too many simultaneous operations.
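As a sketch, both limiters register through the same options delegate as the earlier policies; the numbers below are illustrative, not recommendations:

```csharp
options.AddTokenBucketLimiter(policyName: "token", opt =>
{
    opt.TokenLimit = 100;                              // bucket capacity (max burst)
    opt.TokensPerPeriod = 10;                          // tokens restored each period
    opt.ReplenishmentPeriod = TimeSpan.FromSeconds(1); // refill interval
    opt.AutoReplenishment = true;                      // refill on a background timer
    opt.QueueLimit = 5;
    opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
});

options.AddConcurrencyLimiter(policyName: "concurrent", opt =>
{
    opt.PermitLimit = 10; // at most 10 requests in flight at once
    opt.QueueLimit = 5;   // queue up to 5 more before rejecting
});
```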

Best Practices and Considerations

When implementing rate limiting, keep these points in mind:

  • Choose the Right Strategy: Select the rate-limiting strategy that best fits your application's needs. A fixed window is simple; a sliding window offers greater accuracy; and a token bucket handles bursts well.
  • Set Realistic Limits: Avoid setting limits too low, which can frustrate legitimate users, or too high, which defeats the purpose of rate limiting. Analyze your API's capacity and typical usage patterns.
  • Provide Clear Feedback: Always return the 429 Too Many Requests status code and include a Retry-After header to guide clients on when they can resubmit their requests.
  • Consider Partition Keys Carefully: The choice of partition key (user ID, IP address, API key, etc.) is crucial for effective rate limiting.
  • Monitor and Adjust: Rate limiting is not a set-and-forget feature. Monitor your API's performance and traffic patterns, and be prepared to adjust your limits and strategies as needed.

Conclusion

Implementing rate limiting is a vital step in building robust, secure, and performant .NET APIs. By understanding and leveraging available strategies, such as fixed-window, sliding-window, and IP-based limiting, you can effectively protect your application from abuse and ensure a stable experience for all your users. Experiment with these techniques, tune your limits based on your specific requirements, and keep your APIs running smoothly.

Want to Take This Further?

Rate limiting is just one small but important part of building a production-ready ASP.NET Core Web API.

In my full course, Ultimate ASP.NET Core Web API Development Guide, we go far beyond paging and build a complete, enterprise-ready API using .NET 10 and modern best practices.

Inside the course, you will learn how to:

  • Design clean, RESTful APIs end-to-end
  • Combine filtering, sorting, and paging correctly
  • Structure services, repositories, and DTOs
  • Secure APIs with authentication and JWT
  • Add logging, documentation, caching, and versioning
  • Deploy APIs and databases to Microsoft Azure

👉 Enroll here:

Happy coding!