A TDD Example for Bash Command Chaining in Go
Every sysadmin knows the following command chain for producing a hashed string:
$ echo MyBankPassword | md5sum | awk '{print $1}'
2210253a2a6e6482bb373b668d7b3e90
Out of curiosity, I decided to write a simple function in Go that tries to emulate the same behavior:
The logic of the function is:
- It accepts a string as input and converts it to a byte slice to be consumed by the Sum function of the md5 library. This function returns a byte array of length 16.
- It converts the byte array back into a string by passing the whole array to the EncodeToString function of the hex package.
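A minimal sketch of such a function, following the two steps above, might look like this. The name MD5Hash is the one used later in this article; the exact signature and the main wrapper are assumptions made for illustration.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// MD5Hash returns the hex-encoded MD5 digest of text.
// Note: nothing is appended to the input here.
func MD5Hash(text string) string {
	// md5.Sum takes a byte slice and returns a [16]byte array.
	sum := md5.Sum([]byte(text))
	// Slice the whole array so it can be passed to hex.EncodeToString.
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(MD5Hash("MyBankPassword"))
}
```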
Let’s see the result, using the same MyBankPassword as the input string:
$ bashHash=`echo MyBankPassword | md5sum | awk '{print $1}'`
$ goHash=`go run main.go`
$ if [ $bashHash == $goHash ]; then echo "${bashHash} <-> ${goHash} is true"; else echo "${bashHash} <-> ${goHash} is false"; fi
2210253a2a6e6482bb373b668d7b3e90 <-> 93d290d421543618bbfccaa7aea739d6 is false
Uh-oh, it yielded different hash results. Why?
The Case of Trailing Newline
The difference in results is caused by the implicit \n character that echo appends at the end of the line when called with default arguments.
As seen in echo.c from the GNU coreutils package:
if (display_return)
putchar ('\n');
display_return is a flag that is set to true by default, and is only switched to false if we use the -n flag during execution:
case 'n':
display_return = false;
break;
To properly emulate the above behavior, I made a small modification to the MD5Hash function:
I appended one newline character to the byte-slice representation of the string to be hashed, before it is passed as the argument to the Sum function call, so that the function produces the same output:
$ goHash=`go run main.go`
$ if [ $bashHash == $goHash ]; then echo "${bashHash} <-> ${goHash} is true"; else echo "${bashHash} <-> ${goHash} is false"; fi
2210253a2a6e6482bb373b668d7b3e90 <-> 2210253a2a6e6482bb373b668d7b3e90 is true
TDD for Bash Command Chain Emulation
The MD5Hash function defined above is a wrapper around the underlying functions from the crypto/md5 package. In that case, there is little reason for me to write a unit test for it, since I can be quite confident that the maintainers of that codebase have written good tests.
Again, just out of curiosity, I wrote a unit test for my MD5Hash function. I wanted to know if there is a way to check that this function produces the same result as run-of-the-mill bash tools would.
First, the test:
I defined a simple test case for hashing, using two different strings as inputs. Then I compare the result of the MD5Hash function against the one produced by the bash command chain echo ${strInput} | md5sum | awk '{printf $1}' (notice that I don’t need to pass the ' quotes in the awk variable).
The three bash commands above are passed as arguments to the BashPipedCommand function, which is itself encapsulated inside the private function bashWrapper.
Now, let’s define the function BashPipedCommand:
The logic is as follows:
- The function accepts an arbitrary number of bash commands (at least one), which are stored in a cmds slice.
- Two buffers are created to store the stdout and stderr of the bash command chain.
- The code loops through the elements of the cmds slice, ‘connecting’ the stdout of each command to the stdin of the next command in the chain.
- The output buffer stores the stdout of the last command in the chain, and the stderr, if any, is also collected.
- As the previous steps only define the target buffers for the command invocation, the commands themselves still need to be executed.
- The code then waits for each command to complete, so that the output and errors of the commands end up in the buffers.
- Finally, all the defined return values are returned as byte arrays.
After all that, let’s run the test:
Running tool: /home/linuxbrew/.linuxbrew/bin/go test -timeout 30s -coverprofile=/tmp/vscode-goHjZVHu/go-code-cover github.com/GandhiNN/anonymizer/hasher
ok github.com/GandhiNN/anonymizer/hasher 0.006s coverage: 53.8% of statements
It passes.
Note that the low code coverage is expected, since we are only interested in testing the business logic, i.e. the hashing, not the whole code.
Takeaways
- Be thorough when emulating a bash tool’s behavior in Go, especially if the tool has many invocation options. Reading the man entry for the tool is a good start.
- You can emulate bash command chaining in Go. If your source of truth is the output of a particular bash tool, you can check your Go code’s logic by comparing the output of your function with the output of the bash command chain, in one codebase.
- Test the business logic. You don’t need to test a wrapper around a properly written function provided by a stable package. As can be seen, I did not write unit tests for the functions that come from the crypto/md5 and os/exec packages.
The full code can be found here.