Interfacing with Database : Python and Go Implementation
While I am a systems engineer by chance, but recent events make me want to work and brush up my data skills again.
Originally posted in : https://nng.io/interfacing-with-database-python-go-implementation/
It goes without saying that every journey in data analytics starts by learning SQL, the most mainstream language that is used to access almost all kind of databases out there. Since I want to kickstart the journey, I picked SQLite as my practice environment because of its self-contained and zero-configuration principles, making it the most convenient DB out there for this purpose.
For the last couple of years, I am getting more acquainted with two programming languages to support me in my line of work: Python
and Go
. I can say that Python is a mature language to work with a DB ; I have made a simple tool to import a CSV file into SQLite before. Now I wanted to know how the latter compares when creating a simple API to interact with SQLite DB. As a reference point, I am using tutorials from http://www.sqlitetutorial.net/, and to add some juices along the way, I decided to apply my aforementioned programming language knowledge in following the tutorials.
So let’s get started.
The Setup
sqlite3 3.24.0
provided by MacOS Mojave 10.14Python 3.7.2
go version 1.12 darwin/amd64
chinook.db
sample database
There are 11 tables in the chinook sample database. To make things simple, I decided to compare how the two programming languages work theSELECT * FROM tracks;
statement in terms of number of lines of code needed to execute aforementioned statement and deal with the returned rows afterwards.
Python
It’s simple: first you import the sqlite3
module:
import sqlite3
Then, create a Connection object that represents the database, create a Cursor object from it, and call its execute()
method to execute SQL command.
conn = sqlite3.connect("./chinook.db")
cursor = conn.cursor()
cursor.execute("SELECT * FROM tracks;")
Since the command returns matching rows, assign the return value of fetchall()
to a variable containing list of matching rows.
rows = cursor.fetchall()
If you want to print the elements to standard output, just iterate through its elements.
for row in rows:
print(row)
Lastly, don’t forget to close the DB objects. You see that all can be done in 14–15 lines of code, and we don’t have to worry that much while handling NULL-able fields compared to Go (more on that later).
cursor.close()
conn.close()
Go
Same logic, we have to import the package AND the specific database drivers (which must be installed separately). This way, the driver implementation is abstracted under the hood and we have a DB-independent API to interact with our database (although there are still differences in things such as arguments provided to a method to connect to DBs may have different semantics).
package main
import (
"database/sql"
_ "github.com/mattn/go-sqlite3"
"fmt"
"log"
)
We have to create a db.SQL
object, a database interface abstraction to be used later for creating statements and transactions.
db, err := sql.Open("sqlite3", "./chinook.db")
if err != nil {
log.Fatal(err)
}
defer db.Close()
Since SELECT * FROM xxx;
most likely will return multiple rows, we use the Query
method which will return a *Rows
object that we can iterate and scan in a loop.
rows, err := db.Query("SELECT * FROM tracks;")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
Here, I stumbled into the one of many major differentiators between Go and Python implementation when extracting data from the query statement; While Python’s sqlite3’s fetchall()
provides some sort of convenience by returning query results into a list of rows, in Go we have to statically define the columns' variables and their types.
var (
trackid int
name string
albumid int
mediatypeid int
genreid int
composer *string
milliseconds int
bytes int
unitprice float64
)
Note that composer
is of *string
(pointer to string) type because I found out that field is NULL-able. If you define it as only string
, then you will get the following error
during runtime when it scans a NULL value under that field:
Scan error on column index 5, name "Composer": unsupported Scan, storing driver.Value type <nil> into type *string
Next, we iterate and scan through rows
and create a simple conditional to handle the case with aforementioned NULL-able field by replacing it with "empty_composer" when the the code scans the NULL value:
for rows.Next() {
err := rows.Scan(&trackid,&name,
&albumid,&mediatypeid,&genreid,
&composer,&milliseconds,&bytes,
&unitprice)
if err != nil {
log.Fatal(err)
}
if composer == nil {
temp := "empty_composer"
composer = &temp
}
fmt.Println(trackid,name,albumid,
mediatypeid,genreid,*composer,
milliseconds,bytes,unitprice)
}
We can see that to accomplish the same result, we need around thrice the amount of lines of code in Golang compared to Python. It’s not that baffling considering its static type system and error-handling philosophy (we can skip it entirely of course, but it’s not idiomatic).
Conclusion
As stated, I am not trying to benchmark these two languages in terms of performance; a lot of posts can give a better explanation regarding this. I personally find that Python has the edge in term of library support for this kind of workload and has more user-friendly feel to it. Go is a relatively new language but with its popularity continue to rise I think it will get there in time.