How to share data between tasks in GitHub Actions (examples)

    Recently I had the opportunity to work with GitHub Actions: I had to share a built frontend project between many tasks. I did some digging and worked out the ways you can do that, and in this article I will share them with you. GitHub Actions is a way to automate CI/CD work directly in a repository on GitHub; you enable it by adding a YAML file.

    I'm going to focus here on how to share data between the tasks (jobs) we run in GitHub Actions (GHA).

    There are 2 ways to share data between tasks in GitHub Actions:

    1. Cache
    2. Artifacts


    GitHub has an action called actions/cache. We can use it to save a cache and then restore it in various tasks.

    Let's look at the following example:

        - name: Cache node modules
          uses: actions/cache@v2
          env:
            cache-name: cache-node-modules
          with:
            # npm cache files are stored in `~/.npm` on Linux/macOS
            path: ~/.npm
            key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/package-lock.json') }}
            restore-keys: |
              ${{ runner.os }}-build-${{ env.cache-name }}-
              ${{ runner.os }}-build-
              ${{ runner.os }}-

    It's pretty easy to use. The action itself takes 3 parameters:

    • path (required parameter): the file or directory to save to the cache or restore from it. path can be absolute or relative to the working directory.
    • key (required parameter): the key created when saving the cache and used later to look the cache up. The key can be any string.
    • restore-keys (optional parameter): a list of alternative keys used to search the cache if no cache hit occurred on key.

    This action sets a cache-hit output, which we can use to run a later step conditionally in the following way:

        - name: Cache Primes
          id: cache-primes
          uses: actions/cache@v2
          with:
            path: prime-numbers
            key: ${{ runner.os }}-primes
        - name: Generate Prime Numbers
          if: steps.cache-primes.outputs.cache-hit != 'true'
          run: /generate-primes.sh -d prime-numbe

    As you can see, it is very simple and ready to use right away. Now let's move on to the next way.

    Uploading and downloading artifacts

    Artifacts are used to save files so that you can use them after the run is complete. They are also used to share data between tasks in GHA. To create and use an artifact, you need two actions: actions/upload-artifact and actions/download-artifact.

    To upload a file or directory, use these steps:

        - uses: actions/checkout@v2
        - run: mkdir -p path/to/artifact
        - run: echo hello > path/to/artifact/world.txt
        - uses: actions/upload-artifact@v2
          with:
            name: my-artifact
            path: path/to/artifact/world.txt

    You only need to provide 2 parameters:

    • name: artifact name
    • path: file/directory path

    Then we download the artifact so that we can use it:

        - uses: actions/checkout@v2
        - uses: actions/download-artifact@v2
          with:
            name: my-artifact

    The only thing we need to provide here is the name of the saved artifact, which is my-artifact.
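
    As a sketch of how the two actions fit together across tasks, the workflow below uploads the file in one job and downloads it in another. The workflow name, the job names build and consume, and the @v4 version tags are assumptions made for illustration (the snippets above pin @v2); adjust them to whatever your repository uses.

        name: share-with-artifacts
        on: push

        jobs:
          build:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              - run: mkdir -p path/to/artifact
              - run: echo hello > path/to/artifact/world.txt
              - uses: actions/upload-artifact@v4
                with:
                  name: my-artifact
                  path: path/to/artifact/world.txt

          consume:
            # `needs` makes this job wait until `build` has uploaded the artifact
            needs: build
            runs-on: ubuntu-latest
            steps:
              - uses: actions/download-artifact@v4
                with:
                  name: my-artifact
              # With no `path` input, the file is extracted into the working directory
              - run: cat world.txt

    The needs keyword is what orders the two jobs. Without it they would start in parallel, and the download could run before the upload has finished.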


    Having learned both ways, let's look at how they differ from each other. Both approaches are used to store files on GitHub, but in different ways. Here is the most important difference between them.

    Caching is used to reuse data or files across different tasks or workflow runs. Artifacts, by contrast, are used to save files once the work is finished.

    If you want, for example, to share a build across multiple tasks, then I recommend choosing the cache, because it is simply faster. Uploading artifacts can take a long time, especially with large files.
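
    A minimal sketch of sharing a build through the cache could look like the workflow below. It assumes an npm project whose build script writes to a build directory; that directory, the npm scripts, the job names, and the @v4 tags are all illustrative assumptions rather than part of the original example. The key includes github.sha so that each commit gets its own cache entry.

        name: share-with-cache
        on: push

        jobs:
          build:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              # On a cache miss, the post-step of this action saves `build` when the job succeeds
              - uses: actions/cache@v4
                with:
                  path: build
                  key: ${{ runner.os }}-build-${{ github.sha }}
              - run: npm ci && npm run build

          test:
            needs: build
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              - uses: actions/cache@v4
                id: build-cache
                with:
                  path: build
                  key: ${{ runner.os }}-build-${{ github.sha }}
              - run: npm ci
              # A cache can be evicted, so rebuild if the entry was not found
              - if: steps.build-cache.outputs.cache-hit != 'true'
                run: npm run build
              - run: npm test

    The fallback step is a design choice: a cache entry is not guaranteed to exist, so the consuming job rebuilds instead of failing when it is missing.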

    On the other hand, for logs, test results, and reports, it is better to use artifacts.
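
    For that second case, a sketch might look like the job below. It assumes the test command writes its output to a reports directory; the directory, the artifact name test-report, the 7-day retention, and the @v4 tags are hypothetical choices for illustration.

        name: keep-test-reports
        on: push

        jobs:
          test:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              - run: npm ci
              - run: npm test
              - name: Upload test report
                # always() keeps the report even when the test step above failed
                if: always()
                uses: actions/upload-artifact@v4
                with:
                  name: test-report
                  path: reports/
                  retention-days: 7

    Once the run finishes, the uploaded report can be downloaded from the run's summary page.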


    Here is a workflow that uses both caches and artifacts to share a built Next.js project:

    You can probably see only a small difference between the cache and artifacts. That is because this is just an example and the build output is small, since the project is practically empty.

    The difference is negligible here, but in a real project the time difference will be considerable. I have already seen differences such as: upload (about 5 minutes), download (about 2 minutes). Compare this with the cache, which usually takes a few seconds. After uploading an artifact, you can download it:

    You can also use both methods at once. For example, if you don't want to wait for the build to upload, but you still want to keep the build output, you can use the cache and then upload the build asynchronously as an artifact. All this happens while other tasks are running, so you don't have to wait for the upload to finish.
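
    One way to sketch this combination is with three jobs: build saves the output to the cache, while test and archive both depend only on build and therefore run in parallel, so the slow artifact upload does not block the tests. The build directory, the npm scripts, the job and artifact names, and the @v4 tags are assumptions for illustration. The restore-only actions/cache/restore action and its fail-on-cache-miss input are used here so that a missing cache entry stops the job instead of silently uploading nothing.

        name: cache-and-artifact
        on: push

        jobs:
          build:
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              - uses: actions/cache@v4
                with:
                  path: build
                  key: ${{ runner.os }}-build-${{ github.sha }}
              - run: npm ci && npm run build

          test:
            needs: build
            runs-on: ubuntu-latest
            steps:
              - uses: actions/checkout@v4
              - uses: actions/cache/restore@v4
                with:
                  path: build
                  key: ${{ runner.os }}-build-${{ github.sha }}
                  fail-on-cache-miss: true
              - run: npm ci && npm test

          archive:
            # Runs alongside `test`, so nothing waits for the upload
            needs: build
            runs-on: ubuntu-latest
            steps:
              - uses: actions/cache/restore@v4
                with:
                  path: build
                  key: ${{ runner.os }}-build-${{ github.sha }}
                  fail-on-cache-miss: true
              - uses: actions/upload-artifact@v4
                with:
                  name: build-output
                  path: build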


    I hope this article helps you understand how to share data between tasks in GitHub Actions.

    You can read the original English text here.


