Fix panic when failing to create `Duration` for exponential backoff from a large float (#3621)
## Motivation and Context Avoids panic when [Duration for exponential backoff](https://github.com/smithy-lang/smithy-rs/blob/main/rust-runtime/aws-smithy-runtime/src/client/retries/strategy/standard.rs#L150) could not be created from a large float. ## Description [Duration::from_secs_f64](https://doc.rust-lang.org/core/time/struct.Duration.html#method.from_secs_f64) may panic. This PR switches to use a fallible sibling `Duration::try_from_secs_f64` to avoid panics. If `Duration::try_from_secs_f64` returns an `Err` we fallback to `max_backoff` for subsequent retries. Furthermore, we learned from internal discussion that jitter also needs to be applied to `max_backoff`. This PR updates `calculate_exponential_backoff` to handle all the said business logic in one place. ## Testing - Added a unit test `should_not_panic_when_exponential_backoff_duration_could_not_be_created` - Manually verified reproduction steps provided [in the original PR](https://github.com/awslabs/aws-sdk-rust/issues/1133) <details> <summary>More details</summary> ``` #[tokio::test] async fn repro_1133() { let config = aws_config::from_env() .region(Region::new("non-existing-region")) // forces retries .retry_config( RetryConfig::standard() .with_initial_backoff(Duration::from_millis(1)) .with_max_attempts(100), ) .timeout_config( TimeoutConfig::builder() .operation_attempt_timeout(Duration::from_secs(180)) .operation_timeout(Duration::from_secs(600)) .build(), ) .load() .await; let client: Client = Client::new(&config); let res = client .list_objects_v2() .bucket("bucket-name-does-not-matter") .send() .await; dbg!(res); } ``` Without changes in this PR: ``` ---- repro_1133 stdout ---- thread 'repro_1133' panicked at /rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/time.rs:741:23: can not convert float seconds to Duration: value is either too big or NaN stack backtrace: ... failures: repro_1133 test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 9 filtered out; finished in 338.18s ``` With changes in this PR: ``` // no panic ---- repro_1133 stdout ---- res = Err( TimeoutError( TimeoutError { source: MaybeTimeoutError { kind: Operation, duration: 600s, }, }, ), ) ``` </details> ## Checklist <!--- If a checkbox below is not applicable, then please DELETE it rather than leaving it unchecked --> - [x] I have updated `CHANGELOG.next.toml` if I made changes to the smithy-rs codegen or runtime crates - [x] I have updated `CHANGELOG.next.toml` if I made changes to the AWS SDK, generated SDK code, or SDK runtime crates ## Appendix <details> <summary>runtime-versioner bug fix</summary> This PR also fixes a limitation in `runtime-versioner audit`. I included the fix in the PR because the issue occurred with special conditions, and we don't get to reproduce it every day. The way the issue manifests is this. 1. We have a branch from the main whose latest release tag at the time was `release-2024-04-30` 2. The main has moved ahead and a new `smithy-rs` release has been made `release-2024-05-08` 3. We perform `git merge main`, pre-commit hooks run, and we then get `audit` failures from `runtime-versioner`: ``` 2024-05-10T16:54:36.434968Z WARN runtime_versioner::command::audit: there are newer releases since 'release-2024-04-30' aws-config was changed and version bumped, but the new version number (1.4.0) has already been published to crates.io. Choose a new version number. aws-runtime was changed and version bumped, but the new version number (1.2.2) has already been published to crates.io. Choose a new version number. aws-smithy-runtime-api was changed and version bumped, but the new version number (1.6.0) has already been published to crates.io. Choose a new version number. aws-smithy-types was changed and version bumped, but the new version number (1.1.9) has already been published to crates.io. Choose a new version number. aws-types was changed and version bumped, but the new version number (1.2.1) has already been published to crates.io. Choose a new version number. Error: there are audit failures in the runtime crates ``` This happens because when the latest `main` is being merged to our branch, `runtime-versioner audit` should use `previous_release_tag` `release-2024-05-08` to perform audit but [pre-commit hooks run the tool](https://github.com/smithy-lang/smithy-rs/blob/main/.pre-commit-hooks/runtime-versioner.sh#L8) using the latest previous release tag that can be traced back from `HEAD` of our branch, which is `release-2024-04-30`. Hence the error. The fix adds an environment variable `SMITHY_RS_RUNTIME_VERSIONER_AUDIT_PREVIOUS_RELEASE_TAG` to specify a previous release tag override, in addition to a `--previous-release-tag` command-line argument. Furthermore, the fix has relaxed certain checks in `audit`. Taking our example for instance, when `HEAD` is now behind `release-2024-05-08`, it's OK to fail even if `release-2024-05-08` is not the ancestor of `HEAD` (as stated above, `git merge-base --is-ancestor` does not know that while main is being merged) as long as `release-2024-04-28` (the latest release seen from `HEAD`) is the ancestor of `release-2024-05-08`. ``` release-2024-04-28 release-2024-05-08 │ │ ────────────┼───────────────┬────────────────┼───────x─────► main │ │HEAD │ x │ of x │branch x │ x │ x │ xxxxxxxxxxx │ x │ x git merge main │ x │ ▼ │ └──────────────────► feature branch ``` To use the fix, set the environment variable with the new release tag and perform `git merge main`: ``` ➜ smithy-rs git:(ysaito/fix-panic-in-exp-backoff) ✗ export SMITHY_RS_RUNTIME_VERSIONER_AUDIT_PREVIOUS_RELEASE_TAG=release-2024-05-08 ➜ smithy-rs git:(ysaito/fix-panic-in-exp-backoff) ✗ git merge main ... Running `/Users/awsaito/src/smithy-rs/tools/target/debug/runtime-versioner audit` 2024-05-10T19:32:26.665578Z WARN runtime_versioner::tag: expected previous release to be 'release-2024-04-30', but 'release-2024-05-08' was specified. Proceeding with 'release-2024-05-08'. SUCCESS ``` </summary> ---- _By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice._
Loading
Please register or sign in to comment