Recently I helped a friend deal with a server outage and noticed something interesting: even though the site sat behind a high-defense CDN, the origin server was still knocked offline by traffic for three days. After half a day of digging, it turned out their back-to-origin configuration had effectively dug a pit for their own server: the CDN absorbed the attack traffic, but the misconfigured back-to-origin policy let normal traffic plus cache penetration hammer the origin directly.
These days even a high-defense CDN can hurt you through friendly fire. Don't assume a branded service means you can relax: CDN5's default back-to-origin frequency is absurdly high, and CDN07's default caching rules will even forward dynamic requests to the origin. Leave these details untuned and you have handed the attacker a back door. On one e-commerce platform I tested, adjusting the back-to-origin strategy alone cut bandwidth costs by 40% and brought origin server load from an 80% peak down to under 30%.
**Back-to-origin bandwidth is by nature a cost black hole.** Many people think of a CDN as just acceleration plus protection and ignore the back-to-origin link, which is where the money really burns. In burst-traffic scenarios especially, once the cache hit rate at the CDN edge nodes drops, back-to-origin requests avalanche onto the origin and overwhelm it. Don't be fooled by vendors advertising "unlimited protection": back-to-origin bandwidth is billed by volume, plain and simple. On one 08Host bill I saw, back-to-origin traffic accounted for 60% of the total cost. Scary enough?
Let's start by breaking down a typical misconception: **blindly turning on full-path caching**. Some webmasters, to save effort, apply the CDN's default caching rules wholesale, with the result that even user login state and API responses get cached. User data gets mixed up and authentication breaks. A financial platform's data-leak incident last year traced back to exactly this: the CDN cached requests containing a Token, then replayed them across repeated back-to-origin fetches, which is like leaving the key in the lock for anyone to take.
My advice: **replace global caching with a conditional caching policy**. For example, add judgment logic at the Nginx layer so that acceleration is enabled only for static resources and genuinely cacheable content, while dynamic requests are never cached:
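The original code block did not survive, so here is a minimal sketch of what such conditional logic could look like. The upstream name `origin_backend`, the cache zone `static_cache`, and the `/api/` and `/login` paths are all illustrative assumptions, not values from the source:

```nginx
# Illustrative sketch: cache only static assets, bypass dynamic requests.
# "origin_backend", "static_cache", and the paths below are placeholders.
upstream origin_backend { server 203.0.113.10:8080; }   # placeholder origin

proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=static_cache:100m
                 max_size=10g inactive=30d use_temp_path=off;

map $request_uri $no_cache {
    default   0;
    ~^/api/   1;    # API endpoints: never cache
    ~^/login  1;    # auth pages: never cache
}

server {
    listen 80;

    # Static assets: long cache lifetime, ignore per-user cookies
    location ~* \.(jpg|jpeg|png|gif|webp|css|js|woff2?)$ {
        proxy_pass http://origin_backend;
        proxy_cache static_cache;
        proxy_cache_valid 200 30d;
        proxy_ignore_headers Set-Cookie;
    }

    # Everything else: honor the bypass map above
    location / {
        proxy_pass http://origin_backend;
        proxy_cache static_cache;
        proxy_cache_bypass $no_cache;   # skip cache lookup for dynamic paths
        proxy_no_cache   $no_cache;     # and never store their responses
    }
}
```

The key design point is that the bypass decision is made by URI pattern before any caching happens, so a mis-scoped default rule can never leak an authenticated response.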
Then there is the **back-to-origin protocol choice**. Some teams, to save effort, let the CDN fetch from the origin over plain HTTP while the client-to-CDN leg runs HTTPS; if any intermediate hop gets hijacked, the data travels naked. In my tests, CDN07's default back-to-origin protocol simply inherits the client's protocol, while CDN5 makes it explicitly configurable. I strongly recommend enabling full-link HTTPS: even a self-signed origin certificate beats plaintext.
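If your origin sits behind your own Nginx edge, the re-encrypted back-to-origin leg can be sketched like this. The upstream name and certificate path are assumptions for illustration:

```nginx
# Illustrative: keep the back-to-origin leg encrypted instead of plain HTTP.
upstream origin_backend { server 203.0.113.10:443; }    # placeholder origin

server {
    listen 443 ssl;
    ssl_certificate     /etc/nginx/certs/edge.pem;      # placeholder paths
    ssl_certificate_key /etc/nginx/certs/edge.key;

    location / {
        proxy_pass https://origin_backend;      # HTTPS, not http://
        proxy_ssl_server_name on;               # send SNI to the origin
        proxy_ssl_verify on;                    # relax only for self-signed certs
        proxy_ssl_trusted_certificate /etc/nginx/certs/origin-ca.pem;
    }
}
```

With `proxy_ssl_verify off` a self-signed origin certificate still works, and as the paragraph above argues, that is still strictly better than sending plaintext across the middle mile.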
**Bandwidth optimization centers on improving the cache hit rate.** A test case: a video site started on CDN5's default settings with a cache hit rate of only 62% and nearly 3 TB of back-to-origin traffic per day. It then made three adjustments: first, tiered caching by file type (video files cached for 30 days, page fragments for 2 hours); second, enabling segmented back-to-origin for Range requests; third, caching 404 pages for 5 minutes to block penetration. After the changes, the hit rate jumped to 91% and back-to-origin traffic dropped to 800 GB/day.
Here is a pressure-tested configuration template that fits most websites:
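The original template was lost in extraction; the sketch below reconstructs the three adjustments described above in Nginx terms. The upstream name, cache zone, and slice size are illustrative assumptions:

```nginx
# Illustrative template for the three adjustments: tiered TTLs,
# Range-request slicing, and short-lived 404 caching.
upstream origin_backend { server 203.0.113.10:8080; }   # placeholder origin

proxy_cache_path /var/cache/nginx keys_zone=cdn_cache:200m
                 max_size=50g inactive=30d;

server {
    listen 80;

    # 1) Tiered TTLs by file type: video gets 30 days
    location ~* \.(mp4|webm|mkv)$ {
        proxy_pass http://origin_backend;
        proxy_cache cdn_cache;
        proxy_cache_valid 200 206 30d;

        # 2) Segmented back-to-origin via ngx_http_slice_module:
        #    each 4 MB slice is fetched and cached independently
        slice 4m;
        proxy_set_header Range $slice_range;
        proxy_cache_key $uri$slice_range;
    }

    # Page fragments and everything else: 2-hour TTL
    location / {
        proxy_pass http://origin_backend;
        proxy_cache cdn_cache;
        proxy_cache_valid 200 2h;
        proxy_cache_valid 404 5m;   # 3) cache 404s briefly to stop penetration
    }
}
```

The 404 rule is the cheapest win here: a five-minute negative cache means a flood of requests for nonexistent URLs costs the origin one fetch per URL every five minutes instead of one per request.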
Don't forget **edge-compute preprocessing**. Advanced tiers like 08Host's now support running JavaScript on edge nodes, so CPU-heavy work such as image compression and HTML minification can be pushed to the edge. On a forum site I handled, offloading all image-to-WebP conversion onto the CDN nodes cut origin bandwidth by 70% immediately, and CPU load fell by half an order of magnitude.
On anti-attack strategy: **don't believe that "smart protection" fixes everything**. Many CC attacks simulate normal user behavior and ramp up request frequency gradually; if the back-to-origin policy is too loose, the attack traffic slips past the cache and hits the origin directly. I recommend a tiered challenge mechanism at the CDN layer: low-frequency requests pass straight through, medium-frequency requests trigger a JavaScript challenge, and high-frequency IPs get forcibly delayed responses. Tested on CDN07, this rule set blocked 98% of disguised requests.
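A true JavaScript challenge needs CDN or WAF support, but the tiering idea can be approximated at the Nginx layer with `ngx_http_limit_req_module`. In this sketch (rates, zone name, and upstream are assumptions, and a 429 stands in for the JS challenge), moderate bursts are delayed rather than rejected, and only sustained high rates are cut off:

```nginx
# Illustrative stand-in for tiered throttling. Low rates pass freely,
# burst traffic is queued (delayed), sustained high rates get 429.
upstream origin_backend { server 203.0.113.10:8080; }   # placeholder origin

limit_req_zone $binary_remote_addr zone=per_ip:10m rate=10r/s;

server {
    listen 80;

    location / {
        # <=10 r/s: released immediately.
        # 11-30 r/s: first 10 excess requests pass, the next 20 are
        #            delayed to pace the client (the "slow lane").
        # beyond the burst: rejected with 429.
        limit_req zone=per_ip burst=20 delay=10;
        limit_req_status 429;
        proxy_pass http://origin_backend;
    }
}
```

The design mirror of the three-tier rule in the text: release, slow down, then refuse, so legitimate users who briefly spike are paced instead of blocked.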
Another tip: **play to each CDN vendor's strengths**. CDN5, for example, is extremely well optimized for video streaming, supporting slice pre-fetch and adaptive-bitrate back-to-origin; 08Host excels at global scheduling and can dynamically shift back-to-origin nodes based on origin load. Mixing multiple providers is more complex to configure, but it genuinely avoids single-point bottlenecks. In one financial project I used CDN5 for static resources and 08Host to carry API traffic, and that optimization alone saved a few thousand dollars a month in bandwidth costs.
One last pitfall reminder: **monitoring metrics must see through three layers**. Looking at the CDN console hit rate alone is not enough; you have to correlate it with origin logs and analyze the back-to-origin request patterns. In one case, the CDN hit rate was 85% and everything looked normal, yet origin load stayed high. It turned out a crawler was hammering cold product pages, and a caching-rule mistake sent every one of those requests back to the origin. Adding a "cache 404 pages for 5 minutes" rule fixed it instantly.
At the end of the day, a high-defense CDN is not a plug-and-play safe. Behind the back-to-origin strategy lie cache dynamics, protocol optimization, and attacker psychology (yes, attackers study your configuration habits too). I recommend a quarterly back-to-origin audit: pull origin logs to analyze the top back-to-origin requests, simulate burst traffic with a load-testing tool, and update cache rules to match business changes. Skip these steps and even the most expensive CDN is just psychological comfort.
Now go open your CDN console. Still on the default configuration? Drop the back-to-origin timeout from 30 seconds to 5, strip the Query parameters from the cache key, and mark the login page as not cacheable. Small things like these can be the decisive chips that let your server survive the next traffic storm.
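At the Nginx layer, those three quick tweaks look roughly like this (upstream name and paths are illustrative; the exact knobs on a managed CDN console will differ):

```nginx
# Illustrative: the three quick tweaks from the paragraph above.
upstream origin_backend { server 203.0.113.10:8080; }   # placeholder origin

server {
    listen 80;

    location / {
        proxy_pass http://origin_backend;
        proxy_connect_timeout 5s;           # back-to-origin timeout: 30s -> 5s
        proxy_read_timeout    5s;
        proxy_cache_key $scheme$host$uri;   # no $args: Query params dropped
    }

    location = /login {
        proxy_pass http://origin_backend;
        proxy_no_cache     1;               # never store the login page
        proxy_cache_bypass 1;               # never serve it from cache
        add_header Cache-Control "no-store";
    }
}
```

Dropping `$args` from the cache key is a double-edged sword: it defeats random-query cache-busting attacks, but only do it on paths where the query string genuinely does not change the response.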

