A TDD Example for Bash Command Chaining in Go

Ngakan Nyoman Gandhi
4 min read · Apr 29, 2021


Every sysadmin knows the following command chain for producing a hashed string:

$ echo MyBankPassword | md5sum | awk '{print $1}'
2210253a2a6e6482bb373b668d7b3e90
Image courtesy of CloudFlare

Out of curiosity, I decided to write a simple function in Go that emulates the same behavior.

The logic of the function is:

  1. Accept a string as input and convert it to a byte slice to be consumed by the Sum function of the crypto/md5 package. Sum returns a 16-byte array.
  2. Convert the byte array back into a string by slicing it and passing it to the EncodeToString function of the encoding/hex package.
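The two steps above could be sketched as follows. This is a minimal reconstruction from the description, not the original code: the name MD5Hash comes from the article, while the exact signature and the main function are assumptions made for illustration.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// MD5Hash returns the hex-encoded MD5 digest of s, hashing the string
// exactly as given (nothing is appended).
// The signature is an assumption; only the name appears in the article.
func MD5Hash(s string) string {
	sum := md5.Sum([]byte(s))         // sum is a [16]byte array
	return hex.EncodeToString(sum[:]) // slice the array for EncodeToString
}

func main() {
	fmt.Println(MD5Hash("MyBankPassword"))
}
```

Note that md5.Sum returns an array rather than a slice, which is why the result has to be sliced with sum[:] before it can be hex-encoded.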

Let’s see the result, using the same MyBankPassword as the input string:

$ bashHash=`echo MyBankPassword | md5sum | awk '{print $1}'`
$ goHash=`go run main.go`
$ if [ $bashHash == $goHash ]; then echo "${bashHash} <-> ${goHash} is true"; else echo "${bashHash} <-> ${goHash} is false"; fi
2210253a2a6e6482bb373b668d7b3e90 <-> 93d290d421543618bbfccaa7aea739d6 is false

Uh-oh, it yielded different hash results. Why?

The Case of Trailing Newline

The difference is caused by the newline character (\n) that echo appends to its output by default, so md5sum hashes MyBankPassword\n rather than MyBankPassword.

As seen in echo.c from GNU coreutils package:

if (display_return)
  putchar ('\n');

display_return is a flag that is true by default and is switched to false only when echo is invoked with the -n option:

case 'n':
  display_return = false;
  break;

To properly emulate the above behavior, I made a small modification to the MD5Hash function.

I appended one newline character to the byte slice representation of the string before passing it to the Sum function, so that the function produces the same output as the command chain:

$ goHash=`go run main.go`
$ if [ $bashHash == $goHash ]; then echo "${bashHash} <-> ${goHash} is true"; else echo "${bashHash} <-> ${goHash} is false"; fi
2210253a2a6e6482bb373b668d7b3e90 <-> 2210253a2a6e6482bb373b668d7b3e90 is true

TDD for Bash Command Chain Emulation

The MD5Hash function defined above is a wrapper around functions from the crypto/md5 package. In that case, there is little reason for me to write a unit test for it, since I can be quite confident that the maintainers of that codebase have written good tests.

Again, just out of curiosity, I wrote a unit test for my MD5Hash function. I wanted to know whether there is a way to check that this function produces the same result as run-of-the-mill bash tools.

First, the test.

I defined a simple test case for hashing, using two different strings as inputs. Then I compared the result of the MD5Hash function against the one produced by the bash command chain echo ${strInput} | md5sum | awk '{printf $1}'. Note that the single quotes around the awk program are not needed in the Go code, because os/exec hands each argument to the program directly, without a shell parsing it.

The three bash commands above are passed as arguments to the BashPipedCommand function, which itself is wrapped by the private function bashWrapper.

Now, let’s define the function BashPipedCommand:

The logic is as follows:

  • The function accepts an arbitrary number of bash commands (at least one), which are stored in a cmds slice.
  • Two buffers are created to store the stdout and stderr of the bash command chain.
  • The code loops through the elements of the cmds slice, ‘connecting’ the stdout of each command to the stdin of the next command in the chain.
  • The output buffer stores the stdout of the last command in the chain, and the error buffer collects the stderr, if any.
  • The previous steps only define where each command reads from and writes to, so the commands themselves still need to be started.
  • The code then waits for each command to complete, at which point the output and errors of the commands are in the buffers.
  • Finally, the buffered stdout and stderr are returned as byte slices.
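A function following the steps listed above might look like the sketch below. It is a reconstruction from the description, not the original code. The name BashPipedCommand comes from the article, but the signature is an assumption: here each command is given as an argv slice. One deliberate deviation is that each command gets its own stderr buffer, concatenated at the end, because several commands writing into one shared bytes.Buffer concurrently would be a data race.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os/exec"
)

// BashPipedCommand runs cmds as a pipeline: cmds[0] | cmds[1] | ...
// Each command is an argv slice, e.g. []string{"awk", "{printf $1}"}.
// The signature is an assumption made for this sketch.
func BashPipedCommand(cmds ...[]string) (stdout, stderr []byte, err error) {
	if len(cmds) < 1 {
		return nil, nil, errors.New("at least one command is required")
	}

	// Build the commands, giving each one its own stderr buffer.
	procs := make([]*exec.Cmd, len(cmds))
	errBufs := make([]bytes.Buffer, len(cmds))
	for i, argv := range cmds {
		if len(argv) == 0 {
			return nil, nil, errors.New("empty command")
		}
		procs[i] = exec.Command(argv[0], argv[1:]...)
		procs[i].Stderr = &errBufs[i]
	}

	// Connect each command's stdout to the next command's stdin.
	for i := 0; i < len(procs)-1; i++ {
		pipe, pipeErr := procs[i].StdoutPipe()
		if pipeErr != nil {
			return nil, nil, pipeErr
		}
		procs[i+1].Stdin = pipe
	}

	// The last command writes into the output buffer.
	var output bytes.Buffer
	procs[len(procs)-1].Stdout = &output

	// Nothing has run so far; start every command.
	for _, p := range procs {
		if startErr := p.Start(); startErr != nil {
			return nil, nil, startErr
		}
	}

	// Wait for every command, remembering the first failure.
	for _, p := range procs {
		if waitErr := p.Wait(); waitErr != nil && err == nil {
			err = waitErr
		}
	}

	// Concatenate the per-command stderr output in pipeline order.
	var allErr bytes.Buffer
	for i := range errBufs {
		allErr.Write(errBufs[i].Bytes())
	}
	return output.Bytes(), allErr.Bytes(), err
}

func main() {
	out, errOut, err := BashPipedCommand(
		[]string{"echo", "MyBankPassword"},
		[]string{"md5sum"},
		[]string{"awk", "{printf $1}"},
	)
	fmt.Printf("stdout=%q stderr=%q err=%v\n", out, errOut, err)
}
```

All commands are started before any of them is waited on, so they run concurrently, as they would in a shell pipeline. This sketch does not clean up already-started processes if a later Start fails.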

After all that, let’s run the test:

Running tool: /home/linuxbrew/.linuxbrew/bin/go test -timeout 30s -coverprofile=/tmp/vscode-goHjZVHu/go-code-cover github.com/GandhiNN/anonymizer/hasher

ok      github.com/GandhiNN/anonymizer/hasher   0.006s  coverage: 53.8% of statements

It passes.

Note that the low code coverage is expected, since we are only interested in testing the business logic, i.e. the hashing, not the whole code.

Takeaways

  1. Be thorough when emulating a bash tool’s behavior in Go, especially if the tool has many invocation options. Reading the man entry for the tool is a good start.
  2. You can emulate bash command chaining in Go. If your source of truth is the output of a particular bash tool, you can check your Go code’s logic by comparing the output of your function with the output of the bash command chain, in one codebase.
  3. Test the business logic. You don’t need to test a wrapper around a properly written function provided by a stable package. That is why I did not write unit tests for the functions that come from the crypto/md5 and os/exec packages.

The full code can be found here.
