tag:status.websolr.com,2005:/historyWebsolr Status - Incident History2024-03-28T15:47:41ZWebsolrtag:status.websolr.com,2005:Incident/162701172023-02-24T18:31:39Z2023-02-24T18:35:29ZHeroku Websolr add-on users experiencing provisioning failures<p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>18:31</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved. Heroku users will need to resend their provisioning requests. Heroku will also remove failed provisioning requests after 24 hours, so users will have to resend their provisioning requests.</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>18:22</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>17:25</var> UTC</small><br><strong>Identified</strong> - The issue has been identified and a fix is being implemented.</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>17:00</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.websolr.com,2005:Incident/95971262022-03-21T21:59:26Z2022-03-21T21:59:26ZPlanned Maintenance<p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>21:59</var> UTC</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>20:40</var> UTC</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>20:36</var> UTC</small><br><strong>Scheduled</strong> - We are rolling out an upgrade to our platform. 
This operation is expected to be zero downtime with no customer impact.</p>tag:status.websolr.com,2005:Incident/94003852022-02-24T17:47:46Z2022-02-24T17:49:59ZWebsolr.com is unresponsive<p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>17:47</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>17:15</var> UTC</small><br><strong>Monitoring</strong> - We are monitoring Heroku: https://status.heroku.com/</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>17:10</var> UTC</small><br><strong>Identified</strong> - The issue has been identified. Websolr.com is dependent on Heroku, and there is an incident for Heroku: https://status.heroku.com/</p><p><small>Feb <var data-var='date'>24</var>, <var data-var='time'>16:34</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.websolr.com,2005:Incident/72547432021-06-15T18:00:44Z2021-06-15T18:00:45ZPlanned maintenance<p><small>Jun <var data-var='date'>15</var>, <var data-var='time'>18:00</var> UTC</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jun <var data-var='date'>15</var>, <var data-var='time'>16:15</var> UTC</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Jun <var data-var='date'>15</var>, <var data-var='time'>16:03</var> UTC</small><br><strong>Scheduled</strong> - We are upgrading our routing layer. 
No change in behavior or downtime should be expected.</p>tag:status.websolr.com,2005:Incident/70404622021-05-19T16:07:05Z2021-05-19T16:07:05ZPlanned Maintenance<p><small>May <var data-var='date'>19</var>, <var data-var='time'>16:07</var> UTC</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>May <var data-var='date'>19</var>, <var data-var='time'>16:00</var> UTC</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>May <var data-var='date'>19</var>, <var data-var='time'>15:44</var> UTC</small><br><strong>Scheduled</strong> - We will be upgrading some infrastructure on the main application. During this time the Websolr dashboard will be unavailable. However, Solr services will not be impacted.</p>tag:status.websolr.com,2005:Incident/69830622021-05-12T20:07:52Z2021-05-14T18:01:40ZElevated HTTP 502 Errors in the EU-West Region<p><small>May <var data-var='date'>12</var>, <var data-var='time'>20:07</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>May <var data-var='date'>12</var>, <var data-var='time'>14:29</var> UTC</small><br><strong>Update</strong> - We continue to monitor the fix for any further issues. 
We will publish a postmortem once internal discussions are complete.</p><p><small>May <var data-var='date'>12</var>, <var data-var='time'>14:19</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>May <var data-var='date'>12</var>, <var data-var='time'>13:32</var> UTC</small><br><strong>Identified</strong> - We have identified the root cause and are working on a fix.</p><p><small>May <var data-var='date'>12</var>, <var data-var='time'>13:26</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.websolr.com,2005:Incident/41997422020-05-26T21:00:11Z2020-05-26T21:31:15ZCluster Metrics Unavailable<p><small>May <var data-var='date'>26</var>, <var data-var='time'>21:00</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>May <var data-var='date'>26</var>, <var data-var='time'>20:42</var> UTC</small><br><strong>Update</strong> - We are still working on a resolution, but most indices metrics have been completely restored and are up to date. About 12% of indices are still backfilling historical metrics since the start of the incident. 
A small number of customers will have a small gap in their metrics data.</p><p><small>May <var data-var='date'>26</var>, <var data-var='time'>18:12</var> UTC</small><br><strong>Update</strong> - We have deployed an initial fix and have some data for affected indices currently backfilling.</p><p><small>May <var data-var='date'>26</var>, <var data-var='time'>15:42</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>May <var data-var='date'>26</var>, <var data-var='time'>14:49</var> UTC</small><br><strong>Update</strong> - We are continuing to work on a fix for this issue.</p><p><small>May <var data-var='date'>26</var>, <var data-var='time'>14:48</var> UTC</small><br><strong>Identified</strong> - The issue has been identified and a fix is being implemented.</p>tag:status.websolr.com,2005:Incident/41720772020-05-22T17:41:22Z2020-05-22T17:51:20ZElevated 503 errors in US-West-1<p><small>May <var data-var='date'>22</var>, <var data-var='time'>17:41</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>May <var data-var='date'>22</var>, <var data-var='time'>17:13</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating the issue.</p>tag:status.websolr.com,2005:Incident/22468312019-03-07T17:11:16Z2019-03-07T17:12:42ZElevated errors in the US-East region<p><small>Mar <var data-var='date'> 7</var>, <var data-var='time'>17:11</var> UTC</small><br><strong>Resolved</strong> - The issue was traced to a partial outage in AWS. The root cause was a malfunction within a single AZ, which caused a handful of instances to erroneously be replaced by the ASG. This manifested in Websolr as indices suddenly returning HTTP 500-level errors. Users without replication additionally experienced some data loss. 
Further questions should be directed to support@websolr.com.</p><p><small>Mar <var data-var='date'> 7</var>, <var data-var='time'>16:03</var> UTC</small><br><strong>Investigating</strong> - We are investigating some issues affecting servers in a single availability zone. We are responding to issues as they arise, and are actively investigating the root cause. Automatic recovery is happening slowly, but steadily.</p>tag:status.websolr.com,2005:Incident/22466282019-03-07T15:44:42Z2019-03-07T15:44:42ZElevated 500-level errors in the US-East region<p><small>Mar <var data-var='date'> 7</var>, <var data-var='time'>15:44</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Mar <var data-var='date'> 7</var>, <var data-var='time'>15:17</var> UTC</small><br><strong>Monitoring</strong> - Impacted users should be seeing traffic and performance return to normal.</p><p><small>Mar <var data-var='date'> 7</var>, <var data-var='time'>14:36</var> UTC</small><br><strong>Update</strong> - We are continuing to work on a fix for this issue.</p><p><small>Mar <var data-var='date'> 7</var>, <var data-var='time'>14:36</var> UTC</small><br><strong>Identified</strong> - We have identified an issue affecting performance for a small percentage of users with indices in US-East-1. We are working to resolve it.</p>tag:status.websolr.com,2005:Incident/21988572019-02-08T21:52:43Z2019-02-08T21:52:43ZElevated latency detected in US-East<p><small>Feb <var data-var='date'> 8</var>, <var data-var='time'>21:52</var> UTC</small><br><strong>Resolved</strong> - We have upgraded some infrastructure in our load balancer, which should offer a significant improvement in performance across the region. We will continue to monitor over the weekend.</p><p><small>Feb <var data-var='date'> 8</var>, <var data-var='time'>19:10</var> UTC</small><br><strong>Identified</strong> - We have been alerted to an issue impacting one of our load balancers in the US-East-1 (Virginia) region. 
Our operations team has been notified and is currently working on a fix.</p>tag:status.websolr.com,2005:Incident/16835822018-04-11T22:47:15Z2018-04-11T22:47:15ZConnectivity errors for some indices in the Virginia region<p><small>Apr <var data-var='date'>11</var>, <var data-var='time'>22:47</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Apr <var data-var='date'>11</var>, <var data-var='time'>20:55</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>Apr <var data-var='date'>11</var>, <var data-var='time'>20:54</var> UTC</small><br><strong>Identified</strong> - The issue has been identified and a fix is being implemented.</p><p><small>Apr <var data-var='date'>11</var>, <var data-var='time'>20:53</var> UTC</small><br><strong>Investigating</strong> - We’re investigating reports of connectivity errors for some indices in the Virginia region.</p>tag:status.websolr.com,2005:Incident/15549162018-01-05T13:48:00Z2018-01-05T13:55:55ZSporadic connection timeouts in US-East<p><small>Jan <var data-var='date'> 5</var>, <var data-var='time'>13:48</var> UTC</small><br><strong>Resolved</strong> - The timeouts have been resolved.</p><p><small>Jan <var data-var='date'> 5</var>, <var data-var='time'>13:31</var> UTC</small><br><strong>Identified</strong> - We've identified an issue in US-East which is causing a small number of connection timeouts. 
Remediation is currently in progress.</p>tag:status.websolr.com,2005:Incident/15107592017-11-20T22:26:05Z2017-11-20T22:26:05ZElevated 503s in us-east-1<p><small>Nov <var data-var='date'>20</var>, <var data-var='time'>22:26</var> UTC</small><br><strong>Resolved</strong> - We detected and resolved a networking issue which increased 503s for a subset of customers.</p>tag:status.websolr.com,2005:Incident/13925782017-10-02T18:33:55Z2017-10-02T18:33:55ZElevated error rates for some users in US-East<p><small>Oct <var data-var='date'> 2</var>, <var data-var='time'>18:33</var> UTC</small><br><strong>Resolved</strong> - We have identified and fixed the problem.</p>tag:status.websolr.com,2005:Incident/13451432017-09-01T09:15:36Z2017-09-01T09:15:36ZElevated error rates for some users in US-East<p><small>Sep <var data-var='date'> 1</var>, <var data-var='time'>09:15</var> UTC</small><br><strong>Resolved</strong> - Service has been restored.</p>tag:status.websolr.com,2005:Incident/13420282017-08-29T21:54:25Z2017-08-29T21:54:25ZElevated error rates in US-East<p><small>Aug <var data-var='date'>29</var>, <var data-var='time'>21:54</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Aug <var data-var='date'>29</var>, <var data-var='time'>19:45</var> UTC</small><br><strong>Identified</strong> - We are observing an elevated error rate in US-East. The regression is limited in scope, affecting <0.1% of all requests. We have identified a root cause and are working on a fix.</p>tag:status.websolr.com,2005:Incident/13154832017-08-03T00:45:54Z2017-08-03T00:45:55ZNetwork connectivity and availability issues for some indices in Virginia<p><small>Aug <var data-var='date'> 3</var>, <var data-var='time'>00:45</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Aug <var data-var='date'> 3</var>, <var data-var='time'>00:07</var> UTC</small><br><strong>Update</strong> - From AWS:
<br />
<br /><blockquote><p><i>4:58 PM PDT We can confirm that some instances are unreachable and some EBS volumes are experiencing degraded performance in a single Availability Zone in the US-EAST-1 Region. Engineers are engaged and we are working to resolve the issue.</i></p>
<br /><p><i>5:05 PM PDT We have identified the root cause and are beginning to see recovery for instances and EBS volumes in the affected Availability Zone in the US-EAST-1 Region. We continue to work toward full resolution.</i></p></blockquote></p><p><small>Aug <var data-var='date'> 2</var>, <var data-var='time'>23:53</var> UTC</small><br><strong>Monitoring</strong> - We're detecting a network connectivity event affecting a single Availability Zone in our AWS Virginia region. General system redundancy is operating as designed, with no impact to customer traffic at this time. However we are standing by to intervene if necessary. https://status.aws.amazon.com/</p>tag:status.websolr.com,2005:Incident/12498042017-05-23T18:12:37Z2018-08-01T19:36:11ZElevated 503s for some users in US-East<p><small>May <var data-var='date'>23</var>, <var data-var='time'>18:12</var> UTC</small><br><strong>Resolved</strong> - This issue has been resolved.</p><p><small>May <var data-var='date'>23</var>, <var data-var='time'>17:36</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.websolr.com,2005:Incident/12262732017-05-05T09:40:03Z2017-05-05T09:40:03ZIncreased rate of 503 errors for some indexes in US East region<p><small>May <var data-var='date'> 5</var>, <var data-var='time'>09:40</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>May <var data-var='date'> 5</var>, <var data-var='time'>08:36</var> UTC</small><br><strong>Update</strong> - Starting at 07:29 UTC, a server failure caused increased 503 errors for roughly 3% of indices in the US East region. The issue was detected and subsequently a fix was implemented at 08:04 UTC. 
Index traffic is now stable, and further follow-up maintenance is now being performed on those indices affected.</p><p><small>May <var data-var='date'> 5</var>, <var data-var='time'>08:04</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>May <var data-var='date'> 5</var>, <var data-var='time'>07:52</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.websolr.com,2005:Incident/12156542017-04-27T07:02:00Z2017-04-27T07:22:48ZElevated 503s for some Cobalt/Staging indices in US East<p><small>Apr <var data-var='date'>27</var>, <var data-var='time'>07:02</var> UTC</small><br><strong>Resolved</strong> - The incident has been resolved.</p><p><small>Apr <var data-var='date'>27</var>, <var data-var='time'>06:55</var> UTC</small><br><strong>Identified</strong> - We have identified the issue and are working on a fix.</p><p><small>Apr <var data-var='date'>27</var>, <var data-var='time'>06:30</var> UTC</small><br><strong>Investigating</strong> - Our systems have detected an issue impacting about a dozen Cobalt and Staging indices in US East and we are investigating.</p>tag:status.websolr.com,2005:Incident/12085022017-04-24T18:54:23Z2017-04-24T18:54:23ZElevated 503s for some users in US-East<p><small>Apr <var data-var='date'>24</var>, <var data-var='time'>18:54</var> UTC</small><br><strong>Resolved</strong> - Service has been restored to the impacted server.</p><p><small>Apr <var data-var='date'>24</var>, <var data-var='time'>18:45</var> UTC</small><br><strong>Identified</strong> - We have been automatically paged to respond to a server issue in the US-East region. 
Resolution will be forthcoming over the next several minutes.</p>tag:status.websolr.com,2005:Incident/11762382017-03-29T14:20:00Z2018-08-01T19:36:11ZElevated 503s for some users in US-East<p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>14:20</var> UTC</small><br><strong>Resolved</strong> - The node has been repaired and is confirmed to be serving traffic normally.</p><p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>14:10</var> UTC</small><br><strong>Identified</strong> - We have identified an unhealthy node in the region and are working to resolve it.</p><p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>14:00</var> UTC</small><br><strong>Investigating</strong> - We are investigating reports of persistent HTTP 503 errors in the US-East region.</p>tag:status.websolr.com,2005:Incident/11606122017-03-15T23:26:00Z2018-08-01T19:36:11ZLatency reports in US East<p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>23:26</var> UTC</small><br><strong>Resolved</strong> - We have identified the source of the network bottleneck and addressed it. We will be conducting an RCA to better understand the failure case.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>23:16</var> UTC</small><br><strong>Investigating</strong> - Looking into reports of latency for some indices in US-East.</p>tag:status.websolr.com,2005:Incident/11420442017-02-28T23:30:49Z2017-02-28T23:30:49ZAWS S3 Outage in Virginia<p><small>Feb <var data-var='date'>28</var>, <var data-var='time'>23:30</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Feb <var data-var='date'>28</var>, <var data-var='time'>18:27</var> UTC</small><br><strong>Monitoring</strong> - Servers are unaffected by the AWS S3 outage, but backups are suspended for the duration.</p>