Unable to download tar.gz of prebuilt dependencies [windows]

runner: windows-latest
python: 2.7.18 (upstream, not me) 

The Python build requires some prebuilt Windows dependencies that are downloaded from a URL, because no wheel exists for one required dependency on 2.7.

Here’s some debugging related to the download.

Local: Windows 10

('filesize:', 117555902L)
('checksum:', True, '7212f4440fc983a3868ee3b27148681800c9c2942f6525cee34c46a178d2fdc6', '7212f4440fc983a3868ee3b27148681800c9c2942f6525cee34c46a178d2fdc6')

^ pass

windows-latest

filesize: 82
checksum: False 372f1704a60d98b24f6fdd6b341388d1b9cd2ac9b2f2a47ebcf72a917e9c7931 7212f4440fc983a3868ee3b27148681800c9c2942f6525cee34c46a178d2fdc6

^ fail

Exactly the same logic on both, same Python version, completely different behavior.

The download is handled by requests_download and the file is a tar.gz. The behavior is annoyingly consistent.
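One way to narrow this down is to look at what the 82 bytes actually contain, since a body that small is more likely an error message or a redirect page than a truncated archive. A minimal sketch, assuming the download has already been written to disk; `inspect_download` and its parameters are hypothetical names for illustration and are not part of requests_download:

```python
import hashlib
import os

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def inspect_download(path, expected_sha256, small_limit=1024):
    """Return size, checksum result, and a body preview for tiny files."""
    size = os.path.getsize(path)
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Keep the first bytes so they can be checked and previewed.
        head = handle.read(small_limit)
        digest.update(head)
        # Hash the rest in 1 MiB chunks to avoid loading a large file at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    actual = digest.hexdigest()
    return {
        "filesize": size,
        "checksum_ok": actual == expected_sha256,
        "actual_sha256": actual,
        "looks_like_gzip": head.startswith(GZIP_MAGIC),
        # For a file no larger than small_limit, show what the server sent;
        # for anything bigger, skip the preview.
        "preview": head.decode("utf-8", "replace") if size <= small_limit else None,
    }
```

Calling this on the runner right after the download step and printing the returned `preview` should show whether the server replied with text instead of the tar.gz.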

It leads me to believe that GitHub-hosted runners do not allow downloads from any or certain URLs, but this makes little to no sense, since many actions do exactly the same thing while setting up environments. If there are “disallowed” commands/behaviors/URLs on GitHub-hosted runners, they are not documented anywhere. I cannot find any topics describing the same problem.

What is going on here? I’m at my wits’ end.

@dskvr,

It leads me to believe that GitHub-hosted runners do not allow downloads from any or certain URLs

I don’t think GitHub has any special setting that prevents users from downloading files from certain sites.
Generally, you can download files from any open and safe site on GitHub-hosted runners, just as you would on your local machine.

Below are some examples I tested on windows-latest that download tar.gz files from different sites. I tested the commonly used “Invoke-WebRequest” and “curl”, and both work fine. The examples download a release archive from a public GitHub repository and packages from the websites of some software projects.

jobs:
  test_download:
    runs-on: windows-latest
    steps:
      - name: download tar.gz (Invoke-WebRequest)
        run: |
          mkdir download1
          cd download1
          Invoke-WebRequest -Uri https://github.com/BrightRan/TestClock/archive/release/v1.10.11.tar.gz -OutFile TestClock-v1.10.11.tar.gz
          Invoke-WebRequest -Uri https://curl.haxx.se/download/curl-7.71.1.tar.gz -OutFile curl-7.71.1.tar.gz
          Invoke-WebRequest -Uri https://nodejs.org/dist/v12.18.2/node-v12.18.2-darwin-x64.tar.gz -OutFile node-v12.18.2-darwin-x64.tar.gz

      - name: download tar.gz (curl)
        run: |
          mkdir download2
          cd download2
          curl -L https://github.com/BrightRan/TestClock/archive/release/v1.10.11.tar.gz -o TestClock-v1.10.11.tar.gz
          curl -L https://curl.haxx.se/download/curl-7.71.1.tar.gz -o curl-7.71.1.tar.gz
          curl -L https://nodejs.org/dist/v12.18.2/node-v12.18.2-darwin-x64.tar.gz -o node-v12.18.2-darwin-x64.tar.gz

      - name: view download
        run: |
          ls download1
          ls download2

The same commands also work fine on my local Windows machine.

If possible, please share your repository with us so that we can check the detailed workflow configuration, the command, and the URL you are using. We can then analyze the root cause based on that information.

In addition, please try installing a self-hosted runner on your local Windows machine and running the workflow there to see whether the download works.