Serializing Map Data Structures from Shell Scripts
by Dan Manges

Shell scripts are commonly used to automate DevOps and CI/CD tasks. Shells are ubiquitous, require no additional dependencies, and are designed for running scripts, which makes tasks like handling input and output streams easy.

Unfortunately, shells do not provide an easy way to build a map data structure and serialize it directly from a script, so engineers usually end up inventing a custom format. One format is in common use, but we took a different approach when building Mint.

Common Map Format

Many platforms choose to write values to a single file using = as a delimiter. For example:

key1=value1
key2=value2
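
Reading this format back takes only a few lines of shell. The following is a minimal sketch, not any platform's actual parser: `parse_kv_file` is a hypothetical helper name, and it assumes one pair per line, split on the first `=`.

```shell
# Hypothetical helper: print each pair in a key=value file as "key -> value".
parse_kv_file() {
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      *=*) ;;            # looks like key=value
      *) continue ;;     # skip blank or malformed lines
    esac
    key=${line%%=*}      # text before the first "="
    value=${line#*=}     # text after the first "=" (may itself contain "=")
    printf '%s -> %s\n' "$key" "$value"
  done < "$1"
}
```

Because every line is treated as its own pair, a value that contains a newline would be split across entries.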

This works easily enough for values on a single line, but what happens when values contain a newline?

That problem can be solved by using a different syntax to indicate a multiline value, using an EOF indicator for the end of the value.

{key}<<{delimiter}
{value}
{delimiter}

For example:

key1<<EOF
line1
line2
EOF
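
To give a sense of what the multiline syntax asks of a parser, here is a rough sketch that handles both entry types. It is an illustration under stated assumptions, not any platform's real implementation: `parse_env_file` is a hypothetical name, and an entry is treated as multiline when `<<` appears before the first `=` on the line.

```shell
# Hypothetical parser for a file mixing key=value and key<<DELIM entries.
# Prints each pair as "key=[value]".
parse_env_file() {
  nl='
'
  delim=''
  while IFS= read -r line || [ -n "$line" ]; do
    if [ -n "$delim" ]; then
      # Inside a multiline value: collect lines until the delimiter line.
      if [ "$line" = "$delim" ]; then
        printf '%s=[%s]\n' "$key" "$value"
        delim=''
      else
        value="$value$sep$line"
        sep=$nl
      fi
      continue
    fi
    k_eq=${line%%=*}        # text before the first "=", or the whole line
    k_hd=${line%%'<<'*}     # text before the first "<<", or the whole line
    if [ "${#k_hd}" -lt "${#k_eq}" ]; then
      # "<<" comes first: start of a multiline value.
      key=$k_hd delim=${line#*'<<'} value='' sep=''
    elif [ "${#k_eq}" -lt "${#k_hd}" ]; then
      # "=" comes first: single-line value.
      printf '%s=[%s]\n' "$k_eq" "${line#*=}"
    fi
  done < "$1"
}
```

Even this small version needs state that persists across lines, which the single-line format did not.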

This is fairly simple, but it makes both generating and parsing the file more complex.

This format can also pose a risk of injection. At one point GitHub Actions flagged this as a security risk and recommended generating random values for the EOF indicator. Their docs contained this as the recommended implementation:

EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64)
echo "JSON_RESPONSE<<$EOF" >> "$GITHUB_ENV"
curl https://example.com >> "$GITHUB_ENV"
echo "$EOF" >> "$GITHUB_ENV"
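
For context on the risk a random delimiter guards against, the sketch below writes untrusted content using a fixed `EOF` delimiter. The payload and the `INJECTED` key are invented for illustration, and the temporary file merely stands in for a file such as `$GITHUB_ENV`.

```shell
# Hypothetical demo: untrusted content written with a predictable delimiter.
env_file=$(mktemp)

# Pretend this came from an HTTP response that an attacker controls.
untrusted='harmless
EOF
INJECTED=1'

{
  echo "JSON_RESPONSE<<EOF"
  printf '%s\n' "$untrusted"
  echo "EOF"
} >> "$env_file"

# The payload's own "EOF" line ends the value early, so the next line
# of the file reads as a separate key=value entry.
cat "$env_file"
```

A parser reading this file would see `JSON_RESPONSE` end at the third line and could then treat `INJECTED=1` as a pair of its own.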

Technically, if injection risk is a concern, the random delimiter should be used for all values, not just the multiline values.

A random EOF indicator does not affect parsing complexity, but it does make writing the values more laborious.
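
One way to reduce that effort is to wrap the steps in a function and use it for every value, single-line or not. This is only a sketch: `write_value` is a hypothetical helper, not part of GitHub Actions, and it silences `dd` with `2>/dev/null` instead of `status=none`, a substitution made here for portability.

```shell
# Hypothetical helper: append one entry to a key<<DELIM file,
# generating a fresh random delimiter for every value.
write_value() {
  file=$1 key=$2 value=$3
  # 15 random bytes encode to a 20-character base64 string.
  delim=$(dd if=/dev/urandom bs=15 count=1 2>/dev/null | base64)
  {
    printf '%s<<%s\n' "$key" "$delim"
    printf '%s\n' "$value"
    printf '%s\n' "$delim"
  } >> "$file"
}
```

A call might look like `write_value "$GITHUB_ENV" JSON_RESPONSE "$(curl https://example.com)"`.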

A Simpler Solution using Directories

When we started building output values and environment variables for Mint, our CI/CD platform, we initially used the same common format.

However, we've put tremendous consideration into the interfaces that we've created and the overall developer experience. We didn't want people to have to generate random nonces to write multiline values or be susceptible to the risk of injection.

After exploring some ideas, we realized that the operating system already provides a natural and convenient way to write maps: files.

In Mint, we write values to files in a directory. We use the key as the filename, and the value as the contents of the file. It's easy to generate, easy to parse, and does not pose security risks.

echo value1 > "$MINT_VALUES/key1"

We use the same approach for environment variables:

echo hello > "$MINT_ENV/EXAMPLE_ENV"
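
Reading a directory of values back is equally direct. The sketch below is not Mint's implementation: `read_values_dir` is a hypothetical name, and the trailing-newline handling simply follows shell command substitution rather than any documented Mint behavior.

```shell
# Hypothetical reader: print every file in a directory as "key=[value]".
read_values_dir() {
  for path in "$1"/*; do
    [ -f "$path" ] || continue   # skip subdirectories and an unmatched glob
    key=${path##*/}              # the filename is the key
    value=$(cat "$path")         # the contents are the value; $(...) drops trailing newlines
    printf '%s=[%s]\n' "$key" "$value"
  done
}
```

There is no delimiter to scan for, so a value containing `EOF` or `key=value` text on its own line stays inside that one value.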

Regarding the previous example, instead of needing to do this on GitHub Actions:

EOF=$(dd if=/dev/urandom bs=15 count=1 status=none | base64)
echo "JSON_RESPONSE<<$EOF" >> "$GITHUB_ENV"
curl https://example.com >> "$GITHUB_ENV"
echo "$EOF" >> "$GITHUB_ENV"

You can do this in Mint:

curl https://example.com >> "$MINT_ENV/JSON_RESPONSE"

There could be a slight performance impact when serializing a large number of values, since each value is a separate file, but that's uncommon. In general with Mint, we strive to make the common use case simpler and more secure, and then develop other solutions for less common use cases.

Follow Along

We write a lot about software engineering best practices and CI/CD pipelines in Mint. Follow along on X at @rwx_research, LinkedIn, or our email newsletter.
