Taking a Closer Look at io.SectionReader

Among Go’s standard library packages, the io package holds a special place thanks to its wide-ranging functionality. Today, we’ll zoom in on one specific feature of this package: the io.SectionReader type.

Here is the myfile.txt file used throughout the article.

myfile.txt

Hello, world! This is an example of using SectionReader in Go

What is SectionReader?

The io.SectionReader type in Go reads from a specific section of an underlying io.ReaderAt, such as a file. You can think of it as a “window” onto the data, with a defined starting point and length.

The structure is as follows:

// SectionReader implements Read, Seek, and ReadAt on a section
// of an underlying ReaderAt.
type SectionReader struct {
	r     ReaderAt
	base  int64
	off   int64
	limit int64
}

And its constructor function is:

// NewSectionReader returns a SectionReader that reads from r
// starting at offset off and stops with EOF after n bytes.
func NewSectionReader(r ReaderAt, off, n int64) *SectionReader {
	var remaining int64
	const maxint64 = 1<<63 - 1
	if off <= maxint64-n {
		remaining = n + off
	} else {
		// Overflow, with no way to return error.
		// Assume we can read up to an offset of 1<<63 - 1.
		remaining = maxint64
	}
	return &SectionReader{r, off, off, remaining}
}

This function takes a ReaderAt, an offset, and a number of bytes as arguments and returns a *SectionReader. The returned SectionReader reads from the original ReaderAt but exposes only the section defined by the offset and the number of bytes; offsets passed to its methods are relative to the start of that section.

The function also guards against overflow: if the offset plus the number of bytes would exceed the maximum value of an int64, the limit is clamped to that maximum instead of wrapping around. Because the constructor has no way to return an error, this case is handled silently.

Why Use SectionReader?

SectionReader offers significant benefits when dealing with large files or data streams. Because it reads only the section you ask for, you avoid loading the entire file into memory, which saves both memory and CPU time. This becomes especially important when working with massive files that can’t comfortably fit into memory.

Additionally, SectionReader makes it easy to access and process certain regions of a file without tracking offsets manually. Typical uses include parsing file headers, reading metadata from a known region of a file, or accessing a single ‘chunk’ of a large data stream.

How Does SectionReader Work?

The NewSectionReader function in Go’s io package returns a pointer to a SectionReader. This structure contains four key components:

  • r: the underlying ReaderAt from which data is read.
  • base: the offset in the underlying reader at which the section starts.
  • off: the current read position, expressed as an offset in the underlying reader (not relative to the section).
  • limit: the offset in the underlying reader at which the section ends; bytes at or beyond it are never read.

type SectionReader struct {
	r     ReaderAt
	base  int64
	off   int64
	limit int64
}

SectionReader has several methods, including Read, Seek, ReadAt, and Size, which operate on the section of the underlying reader r between base and limit.

Read

func (s *SectionReader) Read(p []byte) (n int, err error) {
	if s.off >= s.limit {
		return 0, EOF
	}
	if max := s.limit - s.off; int64(len(p)) > max {
		p = p[:max]
	}
	n, err = s.r.ReadAt(p, s.off)
	s.off += int64(n)
	return
}

The Read method is primarily responsible for reading data from the specified section. From the developer’s perspective, it offers convenience by automatically managing the offset and avoiding reading beyond the defined section.

When you use Read, you don’t need to advance the offset yourself or worry about reading the same bytes twice. You only provide a byte slice, and Read fills it with as much data as is available within the section. Each call continues from where the previous one stopped, and once the end of the section has been reached, Read returns io.EOF.

Seek

var errWhence = errors.New("Seek: invalid whence")
var errOffset = errors.New("Seek: invalid offset")

func (s *SectionReader) Seek(offset int64, whence int) (int64, error) {
	switch whence {
	default:
		return 0, errWhence
	case SeekStart:
		offset += s.base
	case SeekCurrent:
		offset += s.off
	case SeekEnd:
		offset += s.limit
	}
	if offset < s.base {
		return 0, errOffset
	}
	s.off = offset
	return offset - s.base, nil
}

The Seek method lets the developer move the current offset to a desired location within the section. This is useful when you want to skip to a particular part of the section without having to read through everything before it.

Depending on the whence parameter, the offset is interpreted relative to the start of the section, the current offset, or the end of the section, and the returned position is always relative to the start of the section. Seek returns an error if you try to move before the beginning of the section. Seeking past the end is allowed; in that case the next Read simply returns io.EOF.

ReadAt

func (s *SectionReader) ReadAt(p []byte, off int64) (n int, err error) {
	if off < 0 || off >= s.limit-s.base {
		return 0, EOF
	}
	off += s.base
	if max := s.limit - off; int64(len(p)) > max {
		p = p[:max]
		n, err = s.r.ReadAt(p, off)
		if err == nil {
			err = EOF
		}
		return n, err
	}
	return s.r.ReadAt(p, off)
}

The ReadAt method provides more explicit control than Read. It reads at a given offset, measured from the start of the section, without changing the current offset of the SectionReader. This can be useful when you need to read specific parts of the section multiple times or in a non-linear order.

Like Read, ReadAt respects the limits of the section and never reads beyond it. If the requested range crosses the end of the section, it returns the bytes that are available together with io.EOF.

Size

// Size returns the size of the section in bytes.
func (s *SectionReader) Size() int64 { return s.limit - s.base }

The Size method provides a simple way to get the size of the section in bytes. This can be useful when you need the length of the section for calculations or for sizing the byte slice you read into.

Using SectionReader

Read

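package main

import (
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	file, err := os.Open("myfile.txt")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// Expose 100 bytes of the file starting at offset 5.
	reader := io.NewSectionReader(file, 5, 100)

	firstChunk := make([]byte, 10)
	n, err := reader.Read(firstChunk)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(firstChunk[:n]))

	// The second Read continues where the first one stopped.
	secondChunk := make([]byte, 15)
	n, err = reader.Read(secondChunk)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(secondChunk[:n]))

	// ReadAll consumes the rest of the section (or of the file, if shorter).
	lastChunk, err := io.ReadAll(reader)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(lastChunk))
}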

With the myfile.txt shown earlier, whose text is “Hello, world! This is an example of using SectionReader in Go”, the output of this program is:

, world! T
his is an examp
le of using SectionReader in Go

Here, we used the Read method to read a section of the file that starts at offset 5 and spans at most 100 bytes. We read the first 10 bytes, then the next 15 bytes, and finally the rest of the section; because the file is shorter than the section, that last read stops at the end of the file. Read managed the offset for us, allowing us to focus on processing the data.

Seek

// file and err come from the os.Open call in the Read example.
reader := io.NewSectionReader(file, 5, 100)

_, err = reader.Seek(20, io.SeekStart)
if err != nil {
	log.Fatal(err)
}

chunk := make([]byte, 15)
_, err = reader.Read(chunk)
if err != nil {
	log.Fatal(err)
}

fmt.Println(string(chunk))

This is the output:

example of usin

We moved the offset 20 bytes past the start of the section and then read 15 bytes from there. The output therefore starts at offset 25 of the file (the section’s initial offset of 5 plus the 20 bytes from Seek) and is 15 bytes long.

ReadAt

// file and err come from the os.Open call in the Read example.
reader := io.NewSectionReader(file, 5, 100)

chunk := make([]byte, 10)
_, err = reader.ReadAt(chunk, 20)
if err != nil {
	log.Fatal(err)
}

fmt.Println(string(chunk))

_, err = reader.Read(chunk)
if err != nil {
	log.Fatal(err)
}

fmt.Println(string(chunk))

This is the output:

example of
, world! T

We used ReadAt to read 10 bytes starting at offset 20 of the section (offset 25 of the file), and then used Read to read the first 10 bytes of the section. ReadAt didn’t change the current offset, so Read started from the beginning of the section.

Size

reader := io.NewSectionReader(file, 5, 100)
fmt.Println(reader.Size())

This is the output:

100

Conclusion

The io.SectionReader in Go is a powerful tool for reading specific sections of a file or other random-access data. It lets you read sequentially, seek, and read at a specific position within the section, which makes it a versatile tool for handling file operations. Whether you’re dealing with large files or need to process specific sections of data, SectionReader can be a great asset in your Go programming toolkit.

Happy coding!

This post is licensed under CC BY 4.0 by the author.