Ensure you have throttling in place. Client must honor exponential back-off policies for 429's and ensure you are doing retries as per the guidance below.
Divide your Key Vault traffic amongst multiple vaults and different regions. Use a separate vault for each security/availability domain. If you have five apps, each in two regions, then we recommend 10 vaults each containing the secrets unique to app and region. A subscription-wide limit for all transaction types is five times the individual key vault limit. For example, HSM-other transactions per subscription are limited to 5,000 transactions in 10 seconds per subscription. Consider caching the secret within your service or app to also reduce the RPS directly to key vault and/or handle burst based traffic. You can also divide your traffic amongst different regions to minimize latency and use a different subscription/vault. Do not send more than the subscription limit to the Key Vault service in a single Azure region.
Cache the secrets you retrieve from Azure Key Vault in memory, and reuse from memory whenever possible. Re-read from Azure Key Vault only when the cached copy stops working (e.g. because it got rotated at the source).
Key Vault is designed for your own services secrets. If you are storing your customers' secrets (especially for high-throughput key storage scenarios), consider putting the keys in a database or storage account with encryption, and storing just the master key in Azure Key Vault.
Encrypt, wrap, and verify public-key operations can be performed with no access to Key Vault, which not only reduces risk of throttling, but also improves reliability (as long as you properly cache the public key material).
If you use Key Vault to store credentials for a service, check if that service supports Azure AD Authentication to authenticate directly. This reduces the load on Key Vault, improves reliability and simplifies your code since Key Vault can now use the Azure AD token. Many services have moved to using Azure AD Auth. See the current list at Services that support managed identities for Azure resources.
Consider staggering your load/deployment over a longer period of time to stay under the current RPS limits.
If your app comprises multiple nodes that need to read the same secret(s), then consider using a fan out pattern, where one entity reads the secret from Key Vault, and fans out to all nodes. Cache the retrieved secrets only in memory.