Incidents | Adrastia
Incidents reported on the status page for Adrastia: https://status.adrastia.io/

**Automatos - Polygon ZkEVM recovered**
https://status.adrastia.io/
Thu, 07 Aug 2025 20:03:45 +0000

**Automatos - Polygon ZkEVM went down**
https://status.adrastia.io/
Thu, 07 Aug 2025 20:03:36 +0000

**World Chain Chainlink Data Streams Updater Downtime**
https://status.adrastia.io/incident/624082
Mon, 21 Jul 2025 21:11:00 -0000

## Summary

Two brief periods of downtime occurred on **July 21, 2025**, affecting Chainlink Data Streams feed updates on **World Chain**.

## Timeline

- **14:02:59 to 14:06:33 PDT** (3 minutes 34 seconds)
- **14:07:17 to 14:11:49 PDT** (4 minutes 32 seconds)

## Impact

During these windows, no Data Streams updates were successfully submitted for the affected feeds. The missed updates did **not** result in unjust liquidations or other downstream protocol issues.

## Root Cause

A **backward skew of ~5 seconds** in the blockchain clock caused submitted reports to have a `validFromTimestamp` later than the block timestamp, leading to reverts. The same issue also impacted **Chainlink's official transmitter** during the affected period.

## Resolution

Once the clock skew resolved, feed updates resumed automatically. A full investigation confirmed the cause and verified the absence of negative downstream effects.

## Follow-Up Actions

- **Blockchain clock skew detection and handling** has been implemented in our infrastructure to prevent recurrence (see the sketch below).

**World Chain Chainlink Data Streams Updater Downtime**
https://status.adrastia.io/incident/624082
Mon, 21 Jul 2025 21:02:00 -0000

Start of downtime.
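The follow-up action above is described only at a high level. As an illustration of what clock-skew handling can look like for a report updater, here is a minimal TypeScript sketch assuming an ethers-style provider and a hypothetical report object with a `validFromTimestamp` field; it is not Adrastia's actual implementation.

```typescript
import { ethers } from "ethers";

// Hypothetical report shape for illustration; the field names are assumptions.
interface DataStreamsReport {
  validFromTimestamp: number; // Unix seconds
  payload: string;
}

// Detects the failure mode from the incident: the chain clock skews backwards,
// so a report's validFromTimestamp ends up later than the block timestamp and
// the on-chain submission reverts.
async function skewWouldCauseRevert(
  provider: ethers.JsonRpcProvider,
  report: DataStreamsReport,
  safetyMarginSec = 2
): Promise<boolean> {
  const latest = await provider.getBlock("latest");
  if (latest === null) return true; // be conservative if the chain cannot be read
  return report.validFromTimestamp + safetyMarginSec > latest.timestamp;
}

// Example handling: wait for the chain clock to catch up instead of burning
// gas on transactions that are guaranteed to revert.
async function submitWhenValid(
  provider: ethers.JsonRpcProvider,
  report: DataStreamsReport,
  submit: (r: DataStreamsReport) => Promise<void>,
  pollMs = 1_000
): Promise<void> {
  while (await skewWouldCauseRevert(provider, report)) {
    await new Promise((resolve) => setTimeout(resolve, pollMs));
  }
  await submit(report);
}
```

The safety margin and polling interval are illustrative placeholders; a production updater would also cap the wait and alert if the skew persists.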
**adrastia.io recovered**
https://status.adrastia.io/
Wed, 14 May 2025 00:15:24 +0000

**adrastia.io went down**
https://status.adrastia.io/
Wed, 14 May 2025 00:13:53 +0000

**Automatos Primary Worker Offline**
https://status.adrastia.io/incident/474190
Sun, 08 Dec 2024 05:43:00 -0000

# Post Mortem: Automatos Primary Worker Offline

## What happened

1. The primary worker node became unresponsive at about 10:40 am UTC on Dec 05.
2. Our team was notified at 11:08 am UTC and responded immediately.
3. After a brief diagnosis, the primary worker was manually rebooted and was back online and fully functional at 11:17 UTC.

## Why this happened

1. Our third-party logging service used for verbose debugging, BetterStack, went down briefly, resulting in network errors. We log network errors, but the volume of data being sent exceeded what BetterStack permits, producing an infinite cycle of errors. Eventually, the server ran out of memory as its network buffers grew, causing it to crash.

## Estimated costs

1. The backup workers took over, at the cost of short delays in submitting transactions.

## How to prevent this in the future

1. We dramatically reduced the verbosity of network error log reporting to prevent such issues (a sketch of this idea follows below).
2. We have suspended BetterStack log reporting until our investigation into avoiding and handling 429 errors is concluded. Note that our Datadog log system still records all vital information.
3. We created an alert for network throughput changes so that we can closely monitor irregularities and be alerted before downtime occurs.
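The root cause was a feedback loop: failures to ship logs were themselves logged, amplifying traffic until the sink rate-limited us and the worker ran out of memory. Below is a minimal TypeScript sketch of the mitigations described above; the endpoint, record shape, buffer size, and backoff thresholds are illustrative placeholders, not Adrastia's actual configuration.

```typescript
// Illustrative log shipper with two safeguards from the post mortem:
// 1. transport failures are counted, not re-logged, so they cannot feed back
//    into the log stream;
// 2. HTTP 429 responses trigger an exponential backoff before the next flush.
type LogRecord = { level: string; message: string; time: number };

class SafeLogShipper {
  private buffer: LogRecord[] = [];
  private droppedTransportErrors = 0;
  private backoffMs = 0;
  private readonly maxBuffer = 1_000;

  constructor(private readonly endpoint: string) {}

  log(level: string, message: string): void {
    // Bound memory: drop the oldest records instead of growing without limit.
    if (this.buffer.length >= this.maxBuffer) this.buffer.shift();
    this.buffer.push({ level, message, time: Date.now() });
  }

  async flush(): Promise<void> {
    if (this.buffer.length === 0) return;
    if (this.backoffMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, this.backoffMs));
    }
    const batch = this.buffer.splice(0, this.buffer.length);
    try {
      const res = await fetch(this.endpoint, {
        method: "POST",
        headers: { "content-type": "application/json" },
        body: JSON.stringify(batch),
      });
      if (res.status === 429) {
        // Rate limited: back off exponentially and retry the batch later.
        this.backoffMs = Math.min(Math.max(this.backoffMs * 2, 1_000), 60_000);
        this.buffer.unshift(...batch);
      } else {
        this.backoffMs = 0;
      }
    } catch {
      // Do NOT call this.log() here: counting instead of re-logging breaks the
      // error feedback loop that caused the incident.
      this.droppedTransportErrors += 1;
    }
  }
}
```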
**Automatos Primary Worker Offline**
https://status.adrastia.io/incident/474190
Thu, 05 Dec 2024 11:17:00 -0000

The primary worker was manually rebooted and came back online.
**Automatos Primary Worker Offline**
https://status.adrastia.io/incident/474190
Thu, 05 Dec 2024 10:40:00 -0000

The primary Automatos worker node went offline.
**Automatos Primary Worker Offline**
https://status.adrastia.io/incident/447941
Mon, 21 Oct 2024 01:30:00 -0000

The primary worker was manually rebooted and came back online.
**Automatos Primary Worker Offline**
https://status.adrastia.io/incident/447941
Mon, 21 Oct 2024 01:10:00 -0000

The primary Automatos worker node went offline.
Automatos - Sei recovered https://status.adrastia.io/ Thu, 19 Sep 2024 19:34:33 +0000 https://status.adrastia.io/#758d4d29285dc0e9e997886b744508b117ec644fc7a4c5da508fc839288e8517 Automatos - Sei recovered
Automatos Sei Downtime Post Mortem https://status.adrastia.io/incident/431932 Thu, 19 Sep 2024 19:34:00 -0000 https://status.adrastia.io/incident/431932#298409233b0dc64bb397ebe1754a87d6a9c6b8387c97629611a43e326beb8ff9 ## What happened - There was a brief 22-minute period where Sei connectivity was down: 1. Between 12:12 PDT and 12:34 PDT, September 19, 2024. ## Why this happened - All of our RPC servers for Sei went down. ## Estimated costs - Automatos was down on Sei for about 22 minutes. - The estimated cost of this downtime is zero. ## How to prevent this in the future - We've added an additional RPC provider for Sei.
Automatos - Sei went down https://status.adrastia.io/ Thu, 19 Sep 2024 19:12:45 +0000 https://status.adrastia.io/#758d4d29285dc0e9e997886b744508b117ec644fc7a4c5da508fc839288e8517 Automatos - Sei went down
Automatos Mode Downtime Post Mortem https://status.adrastia.io/incident/403785 Fri, 26 Jul 2024 23:40:00 -0000 https://status.adrastia.io/incident/403785#e1ccfa1962ee9fa6ae53d1c3816384efa68c0c0fae6c4118e019006fba6c76e3 ## What happened - There was a brief 7-minute period where Mode transactions failed to be submitted: 1. Between 16:35 PST and 16:43 PST, July 25, 2024. ## Why this happened - Some of our RPC providers went down. - Other RPC providers lagged behind the latest block. ## Estimated costs - Automatos was down on Mode for about 7 minutes. - Adrastia Oracles suffered minor staleness: 1. Ionic's utilization and error oracle had one update that was delayed by up to 7 minutes. ## How to prevent this in the future - We'll explore additional RPC providers, if available.
Web App Rootstock Connectivity Downtime Post Mortem https://status.adrastia.io/incident/397934 Sat, 13 Jul 2024 23:59:00 -0000 https://status.adrastia.io/incident/397934#368806d8bb54af463ba39bbbc243feedaa129fec77c22e5e42dd1bc47f76b9c4 ## What happened - At 11:32 PM PST, all of our web app's Rootstock RPC providers became unavailable. - At 11:37 PM PST, we were notified of downtime and responded immediately. - At 11:47 PM PST, our fix went live. ## Why this happened - Our web app was only using 2 RPC providers for Rootstock, and they both went down. ## Estimated costs - Users were unable to view Automatos worker balances for Rootstock via our web app for about 15 minutes. This did not negatively affect anyone. ## How to prevent this in the future - We've already added another RPC provider, DRPC, for Rootstock to our web app.
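The Sei, Mode, and Rootstock incidents above all trace back to RPC endpoints that were either unreachable or lagging behind the chain head, with the stated fix being additional backup providers. Below is a minimal TypeScript sketch of that pattern, assuming ethers v6; the endpoint URLs, freshness threshold, and function name are illustrative placeholders, not Adrastia's actual configuration.

```typescript
// Hypothetical sketch: rotate through several RPC endpoints and skip any that
// are unreachable or whose latest block is stale. URLs and the freshness
// threshold are placeholders, not Adrastia's real providers or settings.
import { JsonRpcProvider } from "ethers";

const RPC_URLS = [
  "https://rpc.provider-a.example",
  "https://rpc.provider-b.example",
  "https://rpc.provider-c.example", // backup provider added after an incident
];

const MAX_BLOCK_AGE_SECONDS = 60; // treat older chain heads as "lagging"

async function healthyProvider(): Promise<JsonRpcProvider> {
  for (const url of RPC_URLS) {
    try {
      const provider = new JsonRpcProvider(url);
      const block = await provider.getBlock("latest");
      if (block === null) continue;

      // Only accept endpoints whose reported head is reasonably fresh.
      const ageSeconds = Math.floor(Date.now() / 1000) - block.timestamp;
      if (ageSeconds <= MAX_BLOCK_AGE_SECONDS) {
        return provider;
      }
    } catch {
      // Unreachable or erroring endpoint: fall through to the next one.
    }
  }
  throw new Error("No healthy RPC provider available");
}
```

The freshness check matters for the Mode case in particular, where providers were reachable but served blocks behind the chain head.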
Automatos Arbitrum One Downtime Post Mortem https://status.adrastia.io/incident/387215 Thu, 20 Jun 2024 21:52:00 -0000 https://status.adrastia.io/incident/387215#86c79dc18353648e673861884d36b8b7d7b00101244baef1ce91fc41a8b955ed ## What happened - There were two periods where many Arbitrum One transactions failed to be submitted: 1. Between 04:27 PST and 06:00 PST. 2. Between 09:59 PST and 11:04 PST. ## Why this happened - Severe network congestion led to extreme gas costs, about 2,000 times higher than usual. - Our workers used a hardcoded gas limit of 20M, while average transactions were consuming up to 1B gas because the network uses dynamic gas accounting. As a result, our worker transactions were rejected due to insufficient gas. ## Estimated costs - Automatos was down on Arbitrum One for about 2 hours and 38 minutes. - Adrastia Oracles suffered minor staleness: 1. Between 04:27 PST and 06:00 PST: Up to about 0.5% price inaccuracy. 2. Between 09:59 PST and 11:04 PST: Up to about 0.1% price inaccuracy. ## How to prevent this in the future - We'll use dynamic gas estimations on Arbitrum One (see the sketch at the end of this section). - We'll improve our error detection and escalation processes to ensure fast response times to incidents like these.
Automatos Blast Downtime Post Mortem https://status.adrastia.io/incident/375719 Tue, 28 May 2024 05:03:00 -0000 https://status.adrastia.io/incident/375719#12b16807879bf4bd718cd0d18b7d8833e43e0434db51c0cbc43b999c7a6c4f28 ## What happened 1. At 9:50 PM PST, all of our worker nodes lost blockchain connectivity. 2. At 9:54 PM PST, we were notified of downtime and immediately responded. 3. At 10:03 PM PST, we changed RPC configurations to bring Automatos back online. Incident resolved. ## Why this happened 1. All of the RPC providers we use with our Automatos workers went offline. ## Estimated costs 1. Automatos was down on Blast for about 13 minutes. 2. Automatos did not miss any work that needed to be performed. ## How to prevent this in the future 1. We'll add more backup RPC providers.
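The Arbitrum One root cause above was a hardcoded 20M gas limit colliding with the chain's dynamic gas accounting during congestion. Below is a minimal TypeScript sketch of the stated preventive measure (per-transaction gas estimation), assuming ethers v6; the public RPC URL, environment variable, and 20% headroom factor are illustrative assumptions, not Adrastia's actual worker code.

```typescript
// Hypothetical sketch: estimate gas for each transaction instead of hardcoding
// a 20M limit, so the limit tracks current network conditions. Names and the
// headroom factor are illustrative only.
import { JsonRpcProvider, Wallet, type TransactionRequest } from "ethers";

const provider = new JsonRpcProvider("https://arb1.arbitrum.io/rpc");
const wallet = new Wallet(process.env.WORKER_PRIVATE_KEY!, provider);

async function submitWithDynamicGas(tx: TransactionRequest) {
  // Ask the node how much gas this call needs right now.
  const estimate = await provider.estimateGas({ ...tx, from: wallet.address });

  // Add headroom so fluctuations between estimation and inclusion don't cause
  // an out-of-gas failure (20% here, purely illustrative).
  const gasLimit = (estimate * 120n) / 100n;

  return wallet.sendTransaction({ ...tx, gasLimit });
}
```

Estimating per transaction means a congestion spike that inflates gas usage simply raises the submitted limit rather than causing rejections, at the cost of one extra RPC call per submission.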