Seeing high amounts of errors when trying to start public macOS jobs
Incident Report for Travis CI
Resolved
We've restarted the errored jobs and they are back in our backlog. We are processing jobs at full capacity, with the usual backlog due to high demand for public macOS builds. We'll be publishing a postmortem of this incident by the end of the week. Thank you for your patience and understanding while we resolved this issue.
Posted Mar 15, 2017 - 02:45 UTC
Update
A mistake in the order in which we brought up some of the services caused all pending macOS jobs to error out quickly. We are currently resetting the state of those jobs so that they'll be queued and run again. We are very sorry for this issue and will post an update when the jobs have been requeued.
Posted Mar 15, 2017 - 01:52 UTC
Update
We are beginning to ramp up capacity and are monitoring things closely.
Posted Mar 15, 2017 - 01:36 UTC
Update
We've restored the backplane to service. We're now verifying it and preparing to resume builds and ramp back up to full capacity.
Posted Mar 15, 2017 - 01:10 UTC
Update
The backplane has come up in an unexpected state and we're escalating with our infrastructure provider, as we'll need their help in resolving this issue.

In the meantime we continue to run public macOS builds at degraded capacity. travis-ci.com jobs are not affected at this time.

Thank you for your patience while we work to resolve this.
Posted Mar 15, 2017 - 00:47 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Mar 15, 2017 - 00:23 UTC
Update
The control plane restart is still in progress. We discovered that the root filesystem partition had filled up and that we weren't alerted to this. We've cleaned up the filesystem and are still working to get the backplane services started again. In the meantime we continue to run public macOS builds at degraded capacity. travis-ci.com jobs are not affected at this time.
Posted Mar 15, 2017 - 00:22 UTC
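As an illustration of the kind of check that would flag a filling root partition before it causes trouble, here is a minimal sketch in Python. The function names, the 80% and 90% thresholds, and the idea of returning a status string are all assumptions made for this example; they do not describe the monitoring actually in use on this infrastructure.

```python
import shutil


def disk_usage_percent(path: str = "/") -> float:
    """Return the percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def classify(percent: float, warn_at: float = 80.0, crit_at: float = 90.0) -> str:
    """Map a usage percentage to a status (thresholds are hypothetical)."""
    if percent >= crit_at:
        return "critical"
    if percent >= warn_at:
        return "warning"
    return "ok"


def check_disk(path: str = "/") -> str:
    """Check one mount point and return its status string."""
    return classify(disk_usage_percent(path))


if __name__ == "__main__":
    # In a real setup this result would be sent to an alerting system
    # rather than printed.
    print(f"/ is {disk_usage_percent('/'):.1f}% full: {check_disk('/')}")
```

Run periodically (for example from cron), a check like this would raise a warning well before the partition is completely full.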
Update
The "control backplane" for part of our virtualization infrastructure is unstable, so we're initiating a restart of the backplane. In the meantime we continue to run public macOS builds at degraded capacity. travis-ci.com jobs are not affected at this time.
Posted Mar 14, 2017 - 23:43 UTC
Update
We are investigating intermittent stability errors on some of the physical servers behind this infrastructure and are working to restore stability. At this time we're running public macOS builds at degraded capacity.
Posted Mar 14, 2017 - 23:27 UTC
Investigating
We are seeing a high number of errors when trying to start public macOS jobs. This is causing build delays for travis-ci.org macOS builds, and we are investigating the cause.
Posted Mar 14, 2017 - 22:59 UTC