Nahornyi Oleh

Introduction

In this article, we explore three approaches to fetching data from SQL databases in Go. We’ll start with the conventional method based on JSON marshaling, move on to a more flexible dynamic approach using maps, and finish with an optimized technique that minimizes overhead and maximizes performance. Each method reveals a different balance between simplicity, flexibility, and efficiency — essential trade-offs for any Go developer working with databases.

In my practice, there were very frequent cases where I had to retrieve data from a database with an unknown structure beforehand. This often happens in the e-commerce industry.

We will consider 3 methods of retrieving data, along with their advantages and disadvantages.

You can get all the code of the examples from the repository
https://github.com/oleg578/dbf

Data preparing

For demonstration purposes, we’ll use MariaDB, though the examples apply equally well to MySQL or any other SQL-compatible database. The focus is on the Go code and data-handling patterns, not on the specific database engine.

Code for create MariaDB docker instance and seed test data are in https://github.com/oleg578/dbf/tree/main/db

All examples are tested on Ubuntu 24.04.1 LTS (Noble Numbat) with 11th Gen Intel Core i5–1135G7 and 16GiB RAM.

Standard way

The standard way is trivial — we fetch rows from database into array and then Marshal it into JSON struct

rs, errRs := con.QueryContext(ctx, q, numbRows)

...

 for rs.Next() {
  var dmy = Dummy{}
  if err := rs.Scan(
   &dmy.ID,
   &dmy.Product,
   &dmy.Description,
   &dmy.Price,
   &dmy.Qty,
   &dmy.Date); err != nil {
   panic(err)
  }
  result = append(result, dmy)
 }
 if err := rs.Err(); err != nil {
  panic(err)
 }
 msg, errRTJ := json.Marshal(result)
 if errRTJ != nil {
  panic(errRTJ)
 }

...

_, errOut := os.Stdout.Write(msg)
...

What about speed?
Test result:

% ./db2json_struct 10000000 1>/dev/null Elapsed time: 12631 ms, HeapAlloc = 5400.725 MB, Sys = 7099.447 MB 10000000 records read

Let’s just remember this as a starting point.

Using map way

Next we consider fetching unknown list of columns — like “SELECT * FROM …”.
The sequence of actions is simple.
Each record will be represent as map[string]interface{}, then

// create result slice
// numbRows is number of rows in result
outRows := make([]map[string]interface{}, 0, numbRows)

We will not serialize each record to save program execution time, and our actions are not complex.
See https://github.com/oleg578/dbf/tree/mapping/example

After fetch rows from database, we will request an slice of columns

columns, err := rs.Columns()

Create slice of values and slice of pointers for data

values := make([]interface{}, len(columns))
valuePointers := make([]interface{}, len(values))
  for i := range values {
    valuePointers[i] = &values[i]
  }

Then for each row we get the map which represent model — https://github.com/oleg578/dbf/blob/mapping/sql2json.go

func Row2Map(columns []string, values []interface{}) (map[string]interface{}, error) {
 rowMap := make(map[string]interface{})
 if len(columns) != len(values) {
  return nil, errors.New("columns and values length not equal")
 }
 for i, col := range columns {
  rowMap[col] = assignCellValue(values[i]) // we will help to typify the value
 }
 return rowMap, nil
}

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

Note:
You may want to pay attention to the function assign CellValue — it's purpose is to pre-assign a type to column values. Simple trick — this function tells the JSON encoder which values to accept as non-numeric.
Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz BenchmarkRow2Map BenchmarkRow2Map-8 7376358 159.0 ns/op 336 B/op 2 allocs/op

Finally, execution time of our example — https://github.com/oleg578/dbf/blob/mapping/example/main.go
% ./db2json_map 10000000 1>/dev/null Elapsed time: 12152 ms, HeapAlloc = 8966.899 MB, Sys = 12248.287 MB 10000000 records read

What can we say about these results?

Benefits — We can obtain structures that are not predefined.

Disadvantages — there are several.
But we pay for this with memory consumption.
It’s easy to see that such an implementation consumes 1.5 times more memory.

The second disadvantage which can give us a headache are data types. We must define what we will access as numeric data, and what we will define as string. In current we use AssignValue auxiliary function:

func assignCellValue(val interface{}) interface{} {
 if b, ok := val.([]byte); ok {
  if floatValue, err := strconv.ParseFloat(string(b), 64); err == nil {
   return floatValue
  }
  return string(b)
 }
 return val
}

If value can be represented as float — then we define it as interface value (json library will define it as numeric or null), else as string.

The third disadvantage is map structure property — we can’t guarantee order of fields in json. But this disadvantage may be unimportant.

We cannot say that the result we got is satisfactory.He devours memory. On small datasets this may be acceptable, but this can negatively affect processing with large amounts of data.
https://github.com/oleg578/dbf/tree/mapping

What can we improve?

Let’s look at the weak points in our algorithm of actions when we use maps.

This a creating map[string]interface{} for each row — this very expensive operation in terms of resources and processor time;
And another too expensive operation — JSON marshaling the final slice which can be very big.

It’s time to think about improvement

Let’s play with data structures

When fetching data from a SQL table, the order of columns is always defined by the database — deterministic and consistent across rows. This means we can skip the overhead of maps and instead rely on two coordinated slices: one holding column names, the other holding their corresponding values. It’s simpler, faster, and perfectly aligned with the way the database delivers data.

The next refinement is to request values as byte slices and retrieve column metadata via ColumnTypes. This approach gives us type information up front while keeping data handling lightweight — an efficient foundation for streaming, type conversion, or dynamic encoding later on.

columns, err := rs.ColumnTypes()
 if err != nil {
  log.Fatalf("fault get column types: %v", err)
 }
 values := make([]sql.RawBytes, len(columns))
 valuePointers := make([]interface{}, len(values))
 for i := range values {
  valuePointers[i] = &values[i]
 }

So, we are ready to fetch data in a new way and it’s time to serialize this data.
The JSON library is heavy, but we can make serialization easier.

Columns are simple token-level data usually — we can turn them into JSON strings, then we will just escape special symbols by our escape function -

func escape(in []byte) []byte {
 var out bytes.Buffer
 for _, b := range in {
  switch b {
  case '\n', '\r', '\t', '\b', '\f', '\\', '"':
   out.WriteByte('\\')
   out.WriteByte(b)
  case '/':
   out.WriteByte('\\')
   out.WriteByte(b)
  default:
   if b < 32 || b == 127 {
    out.WriteString(`\u00`)
    out.WriteString(strconv.FormatInt(int64(b), 16))
   } else {
    out.WriteByte(b)
   }
  }
 }
 return out.Bytes()
}

Now let’s think about data types. We can determine the type of column (using sql ColumnType structure sql.ColumnType)

func isDigit(c *sql.ColumnType) bool {
 switch c.DatabaseTypeName() {
 case "TINYINT":
  return true
 case "SMALLINT":
  return true
 case "MEDIUMINT":
  return true
 case "BIGINT":
  return true
 case "INT":
  return true
 case "INT1":
  return true
 case "INT2":
  return true
 case "INT3":
  return true
 case "INT4":
  return true
 case "INT8":
  return true
 case "BOOL":
  return true
 case "BOOLEAN":
  return true
 case "DECIMAL":
  return true
 case "DEC":
  return true
 case "NUMERIC":
  return true
 case "FIXED":
  return true
 case "NUMBER":
  return true
 case "FLOAT":
  return true
 case "DOUBLE":
  return true
 default:
  return false
 }
}

And finally, let’s apply primitive serialization:

func Row2Json(columns []*sql.ColumnType, values []sql.RawBytes) (string, error) {
 if len(values) == 0 {
  return "", errors.New("no data in values")
 }
 if len(columns) != len(values) {
  return "", errors.New("columns and values length not equal")
 }
 var buff strings.Builder
 buff.WriteByte('{')
 for i, val := range values {
  buff.WriteByte('"')
  buff.WriteString(columns[i].Name())
  buff.WriteByte('"')
  buff.WriteByte(':')
  if len(val) > 0 {
   if !isDigit(columns[i]) {
    buff.WriteByte('"')
   }
   buff.Write(escape(val))
   if !isDigit(columns[i]) {
    buff.WriteByte('"')
   }
  } else {
   buff.WriteString("null")
  }
  if i != len(values)-1 {
   buff.WriteByte(',')
  }
 }
 buff.WriteByte('}')

 return buff.String(), nil
}

Benchmark:
cpu: 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz BenchmarkRow2Json BenchmarkRow2Json-8 2881545 385.3 ns/op 440 B/op 9 allocs/op

Use UNIX way to output

We will use UNIX way of output of our program — i.e. we will not create an output slice — we will out data into standard output stream instead — then we can use the output as a standard pipe in UNIX:

// create new buffer
 writer := bufio.NewWriter(os.Stdout)
 defer writer.Flush()

 writer.WriteByte('[') //start print array of data
 ...
 msg, errMsg := dbf.Row2Json(columns, values)
 ...
 if _, err := writer.WriteString(msg); err != nil {
   log.Fatalf("fault write row: %v", err)
  } // write serialized row
 ...
 writer.WriteByte(']') // finish serialized slice

In success execution we will get something like:

%  ./db2json_ds 3 2>/dev/null | jq
[
  {
    "id": 1,
    "product": "product_1",
    "description": null,
    "price": 1.23,
    "qty": 10,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 2,
    "product": "product_2",
    "description": null,
    "price": 2.23,
    "qty": 20,
    "date": "2021-01-01 00:00:00"
  },
  {
    "id": 3,
    "product": "product_3",
    "description": null,
    "price": 3.23,
    "qty": 30,
    "date": "2021-01-01 00:00:00"
  }
]

It’s a time of the moment of truth — fetch 10 million records:

% ./db2json_ds 10000000 1>/dev/null Elapsed time: 11894 ms, HeapAlloc = 2.436 MB, Sys = 11.710 MB 10000000 records read
Let’s compare with the starting point:

Execution time — 11.647 seconds instead 12.631 seconds;
Memory consumption — 11.710 MB instead 7099.447 MB. So, up to 10 percent faster and 600 times less memory consumption.

Conclusion

Let’s examine the broader scope of the tests.
Test comparison table (the length of result json file is 1.2Gb)

Benchmarks comparison table:

Real test memory consumption comparison table:

The fast method (fetch data using slices) is not the fastest in benchmarks, but it’s fastest and the least gluttonous in real tests.

Quantitatively, the slice-based approach delivers:

Performance comparison: Full 10M-row export finishes in 11.647 s versus 12.631 s for the internal library method (Standard) (~7.8% faster).
Speed comparison: Throughput grows from 0.79 million to 0.86 million rows per second (~8.4% gain).
Memory consumption: Peak heap usage shrinks from 7.1 GB to 11.7 MB (≈99.8% reduction, about 600× smaller).

Analyzing the results, we can make two important conclusions, in my opinion :

simple solutions always work more effectively;
real tests are always more important than synthetic ones.

Happy coding 🙂

Oleh Nahornyi

Three ways to fetch SQL data in Go