Pilot Lvl 1
Message 1 of 17

Windows builds failing randomly

I have builds set up for Windows (windows-latest) and macOS - it's a matrix, so 2 builds run for each.

 

Sometimes one of the Windows ones fails and cuts off messages like this:

 

2020-04-19T08:57:16.6185991Z -- Configuring done
2020-04-19T08:57:19.0547176Z -- Generating done
2020-04-19T08:57:19.6715826Z CMake Warning:
2020-04-19T08:57:19.6918171Z ##[error]Process completed with exit code 1.
2020-04-19T08:57:19.6939372Z Cleaning up orphan processes

...but the other Windows build works. Then I re-run everything using the "re-run" button and they all pass (or they swap and the other Windows one fails this time).

 

It seems like something is killing the process randomly, but I don't have access to any additional info, just the cut-off logs.

 

This is happening frequently enough to be annoying, especially since the macOS builds also randomly get "cancelled by user" when nobody has touched them :-)

 

Windows builds take ~20-25 minutes, so manually re-running all the time is not a good option.

 

Any idea how to troubleshoot this?

16 Replies
GitHub Partner
Message 2 of 17

Re: Windows builds failing randomly

Could you please enable runner diagnostic logging and step debug logging for your workflow? Let's check whether they give more detailed information. And if you don't mind, please share a failing workflow run with debug logs here.

 

Please note that debug logging will be enabled for all workflows in the current repository. Once you have a debug log for a failed workflow run, you can disable it again.

Pilot Lvl 1
Message 3 of 17

Re: Windows builds failing randomly

Thank you for the reply. I don't have admin access to set that up. I'll have to try to get the organization admin to do it.

 

In the meantime, here's one that failed like this (without the extra debug information). And here is the workflow YAML.

 

When I re-run these, they usually pass the first time, but sometimes it takes a couple of tries.

 

I notice that it's (often? always?) when it's doing cmake, so I wonder if it's a bug in the Windows cmake that's installed.

 

I'll update when I can get that extra information.

Pilot Lvl 1
Message 4 of 17

Re: Windows builds failing randomly

Since I can't seem to attach anything here in the forum, I uploaded the logs over here.

 

I took a look through them and can't see anything obvious.

 

The one that failed this time is Windows Latest MSVC SCALAR_DOUBLE=OFF in step 7_Configure CMake.txt (sometimes it's Windows Latest MSVC SCALAR_DOUBLE=ON that fails though).

 

...
2020-04-21T22:54:16.1043114Z -- QHULL found (include: C:/Miniconda3/envs/ccenv/Library/include, lib: optimized;C:/Miniconda3/envs/ccenv/Library/lib/qhullstatic.lib;debug;C:/Miniconda3/envs/ccenv/Library/lib/qhullstatic.lib)
2020-04-21T22:54:16.1043705Z -- looking for PCL_COMMON
2020-04-21T22:54:16.1058916Z -- Found PCL_COMMON: C:/Miniconda3/envs/ccenv/Library/lib/pcl_common_release.lib
2020-04-21T22:54:16.8933755Z CMake Warning (dev) at C:/Program Files/CMake/share/cmake-3.17/Modules/FindPackageHandleStandardArgs.cmake:272 (message):
2020-04-21T22:54:19.1474864Z ##[error]Process completed with exit code 1.
2020-04-21T22:54:19.1478756Z ##[debug]Finishing: Configure CMake

 

I don't see anything in the additional Runner and Worker logs that looks strange, but I don't know what I'm looking for exactly.

GitHub Partner
Message 5 of 17

Re: Windows builds failing randomly

I have forked your repo to my org and hit the same error as you. I am trying to involve the GitHub engineering team to investigate this issue further. There might be some delay. I appreciate your patience.

Pilot Lvl 1
Message 6 of 17

Re: Windows builds failing randomly

Great - thank you for your time.

 

This was a very large time sink yesterday as I had to re-run dozens of times.

 

If you (or they) have any suggestions on how to reduce the build time on Windows (~25 minutes) I'd love to hear that as well :-)

Pilot Lvl 1
Message 7 of 17

Re: Windows builds failing randomly

I have another run that keeps failing - I've tried it at least a dozen times today. I know it should pass because it passed at the pull request stage, was merged, and now the CI run on the merged commit keeps failing :-)

 

I turned on the logging and found this in one of the "worker" logs - maybe it will help?

 

 

[2020-04-23 17:14:49Z INFO HostContext] No proxy settings were found based on environmental variables (http_proxy/https_proxy/HTTP_PROXY/HTTPS_PROXY)
[2020-04-23 17:14:49Z INFO HostContext] Well known directory 'Bin': 'C:\runners\2.169.1\bin'
[2020-04-23 17:14:49Z INFO HostContext] Well known directory 'Root': 'C:\runners\2.169.1'
[2020-04-23 17:14:49Z INFO HostContext] Well known config file 'Credentials': 'C:\runners\2.169.1\.credentials'
[2020-04-23 17:14:49Z INFO Worker] Version: 2.169.1
[2020-04-23 17:14:49Z INFO Worker] Commit: 4f840647b375b65a4efa361abc1b84f783442de6
[2020-04-23 17:14:49Z INFO Worker] Culture: en-US
[2020-04-23 17:14:49Z INFO Worker] UI Culture: en-US
[2020-04-23 17:14:49Z ERR  Worker] System.ArgumentException: Handle has been disposed or is invalid. (Parameter 'pipeHandleAsString')
   at System.IO.Pipes.AnonymousPipeClientStream..ctor(PipeDirection direction, String pipeHandleAsString)
   at GitHub.Runner.Common.ProcessChannel.StartClient(String pipeNameInput, String pipeNameOutput)
   at GitHub.Runner.Worker.Worker.RunAsync(String pipeIn, String pipeOut)
   at GitHub.Runner.Worker.Program.MainAsync(IHostContext context, String[] args)

 

 

GitHub Partner
Message 8 of 17

Re: Windows builds failing randomly

Thank you for looking deeply into this issue. I am still trying to reach a senior engineer. Your patience is appreciated.

Pilot Lvl 1
Message 9 of 17

Re: Windows builds failing randomly

Thank you.

 

We have had to re-run these things dozens of times (and you can't just re-run the ones that fail - it's all-or-nothing), so not only has this been very frustrating, it has been a major time sink and a waste of build resources at GitHub.

 

Combine this with the macOS runners randomly deciding someone cancelled them, and it shakes my confidence in Actions as a CI system. It kind of goes without saying that we need to be able to trust this and rely on it to be rock-solid.

Copilot Lvl 2
Message 10 of 17

Re: Windows builds failing randomly

Hi @asmaloney !

Thank you for your patience.

Could you please check the investigation results below and see if it helps to solve the issue?

 

It turned out that, by default, CMake does not display its internal errors. To find the reason for the failure, we changed the workflow to show the CMake logs and saw that there is a compilation error from the C++ build itself.

CMakeError.log content
Performing C++ SOURCE FILE Test COMPILER_HAS_DEPRECATED_ATTR failed with the following output:
Change Dir: D:/a/CloudCompare/CloudCompare/build/CMakeFiles/CMakeTmp
Run Build Command(s):C:/Miniconda3/envs/ccenv/Library/bin/ninja.exe cmTC_144a7 && [1/2] Building CXX object CMakeFiles\cmTC_144a7.dir\src.cxx.obj
FAILED: CMakeFiles/cmTC_144a7.dir/src.cxx.obj 
C:\PROGRA~2\MICROS~1\2019\ENTERP~1\VC\Tools\MSVC\1425~1.286\bin\Hostx64\x64\cl.exe  /nologo /TP   /DWIN32 /D_WINDOWS /W3 /GR /EHsc -DCOMPILER_HAS_DEPRECATED_ATTR /MDd /Zi /Ob0 /Od /RTC1   -std:c++14 /showIncludes /FoCMakeFiles\cmTC_144a7.dir\src.cxx.obj /FdCMakeFiles\cmTC_144a7.dir\ /FS -c src.cxx
src.cxx(1): error C2065: '__deprecated__': undeclared identifier
src.cxx(1): error C4430: missing type specifier - int assumed. Note: C++ does not support default-int
src.cxx(1): error C2448: '__attribute__': function-style initializer appears to be a function definition
ninja: build stopped: subcommand failed.

The diff to show logs: dsame/CloudCompare@e8248d4 (idea is to surround cmake... with try...catch and output all the logs)
The sample build: https://github.com/dsame/CloudCompare/runs/645255014?check_suite_focus=true
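
For readers who cannot open the linked diff, the same idea (run the configure step and, if it fails, print CMake's internal logs before failing the job) can be sketched in Python. This is only an illustration and not the code from the commit above: the function name `run_configure` is hypothetical, and it assumes the logs are in `CMakeFiles/CMakeOutput.log` and `CMakeFiles/CMakeError.log` under the build directory, which is where the CMake version in these logs (3.17) writes them. Other CMake versions may use different file names.

```python
import subprocess
import sys
from pathlib import Path


def run_configure(cmd, build_dir):
    """Run a configure command in build_dir (which must already exist).

    If the command exits with a non-zero code, print CMake's internal
    logs. Returns the exit code so the CI step can still fail.
    """
    result = subprocess.run(cmd, cwd=build_dir)
    if result.returncode != 0:
        # Assumed log locations (CMake 3.17); either file may be missing
        # if the configure step failed very early.
        for name in ("CMakeOutput.log", "CMakeError.log"):
            log_path = Path(build_dir) / "CMakeFiles" / name
            print(f"===== {log_path} =====", flush=True)
            if log_path.is_file():
                print(log_path.read_text(errors="replace"), flush=True)
            else:
                print("(not found)", flush=True)
    return result.returncode
```

A workflow step could then call something like `sys.exit(run_configure(["cmake", ".."], "build"))`, where the `cmake` arguments shown are placeholders for whatever the real configure step passes.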