C++ is not well liked by a large number of non-C++ programmers. One of the arguments that has stood the test of time against C++ is the iostreams library. It’s even a point of contention within the C++ community. The main arguments against the entire iostreams library can usually be summed up as:
- Overloading operators << and >>
- Performance
- “God Objects”
We’re going to ignore that first bullet point, however, because it has been argued to death and back. Developers can argue all day about what should and should not be overloaded (and they usually do).
However, iostreams are (to say the least) less performant than their C stdio counterparts, enough to warrant a variety of StackOverflow posts about it. Out of all the classes and interfaces in the C++ standard library, the iostreams library is the least modern (and indeed the one that received the least attention during the C++11 and C++14 revisions). Among its issues are a reliance on inheritance and virtual functions, and a lack of user-defined allocator support at type instantiation time. Out of all the parts of the C++ standard library, iostreams are the least C++, and arguably the most over-engineered. C++’s motto is “pay only for what you use”, and a C++ programmer shouldn’t have to pay such a heavy tax for something as simple as reading bytes from a file, string, or other resource.
The last bullet point, that iostreams are God Objects, is the most important. If
there is one issue that I have with C++ streams, it is that they do too much.
They do formatting and scanning of text, reading and writing of binary data,
the handling of locale information (though not well enough to avoid needing
something like Boost.Locale to get it right), buffer overflow and buffer
underflow, position seeking, and worst of all, the very nature of a stream
results in an overcomplicated user-defined stream insertion/extraction
overload. When we rely on Argument Dependent Lookup and the ostream& operator <<
overload, we end up in a bit of a pickle. What happens if we want our
user-defined type to be output as binary instead of text? How do we, as the
user of this library, decide if we want to have a binary formatted output
function or a text formatted output function? There is no good (or rather easy)
answer. What ends up happening is a user provides an ostream& operator << so
they can just dump text to the console or to a string for debug information.
There is no way to say “When writing to this resource, we treat it as binary
data. When writing to a different one, we will use text formatting for printing
log information”. We can’t just let ADL kick in and take care of the rest for
us. It is for this reason that we end up with libraries like cereal and
Boost.Serialization, so that developers can explicitly state “here’s how we
store our data”, even if it is for the smallest of utilities.
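To make the dilemma concrete, here is a minimal sketch. `Point`, `write_binary`, `as_text`, and `as_binary` are hypothetical names invented for this illustration; only the standard `<ostream>` and `<sstream>` facilities are real. It shows that the single `operator<<` found by ADL is always the text one, so binary output has to live in a separately named function that the caller must know to call.

```cpp
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, used only for illustration.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

// The one overload ADL finds for every std::ostream, whether it wraps a
// console, a log file, or a file opened with std::ios::binary.
std::ostream& operator<<(std::ostream& os, Point const& p) {
    return os << "Point(" << p.x << ", " << p.y << ")";
}

// Binary output needs its own name. The stream type alone cannot select it,
// so every caller has to know about this function and call it explicitly.
std::ostream& write_binary(std::ostream& os, Point const& p) {
    os.write(reinterpret_cast<char const*>(&p.x), sizeof p.x);
    os.write(reinterpret_cast<char const*>(&p.y), sizeof p.y);
    return os;
}

// Text form, produced through operator<<.
std::string as_text(Point const& p) {
    std::ostringstream os;
    os << p;
    return os.str();
}

// Binary form (native byte order), produced through the named function.
std::string as_binary(Point const& p) {
    std::ostringstream os(std::ios::out | std::ios::binary);
    write_binary(os, p);
    return os.str();
}
```

Opening the stream with `std::ios::binary` changes nothing about which function gets picked: `os << p` on that stream would still emit text.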
How do we solve this? The committee won’t (or maybe can’t; backwards compatibility is a big deal). The streams library is here to stay, but we need a better alternative: one that lets us rely on ADL, that doesn’t lean on inheritance or virtual functions, and that lets the user decide how to read and write binary data versus formatting or scanning text data, or even printing.
What we need are better, more generic I/O facilities for C++. We need a library that splits up the different Concepts of Resources, Readers, Writers, Streams, Formatters, Scanners, Buffers, and even smaller concepts such as position Seeking within a Resource.
Something like this wouldn’t replace serialization libraries like Boost.Serialization, but would be a better foundation for their output.
I’ve looked to other programming languages for inspiration on what a potential Modern C++ I/O API might look like. Rust has some pretty good concepts, some that would even map 1:1 to C++. However, they rely on some Rust-specific language features. They do not distinguish between Read/Write and Format/Scan, nor do they treat every possible Resource uniformly, opting instead to write one Reader for each possible Resource (MemReader, FileReader, BufReader, etc). Even Java has some decent concepts (allowing for a writeObject function), however because Java is “all aboot the oop”, it relies on a user defining a class to handle Read/Write vs Format/Scan, as well as on its builtin reflection system.
With these ideas in mind, I was able to develop some brief concepts that a Modern C++ I/O API should express. First, we have our verbs. These are the functions that are used for ADL to allow a generic approach to perform actions on those types which meet our Concepts. All of them relate to either a Concept, or are taken from a C stdio-like name.
- read
- Read binary data from a Resource
- write
- Write binary data to a Resource
- scan
- Read text from a Resource
- format
- Write text to a Resource
- open
- Opens a resource for I/O operations
- close
- Closes a resource for I/O operations
- flush
- Flushes a Buffer to its Resource
- sync
- Synchronizes a resource with the operating system if possible
- tell
- Gets the current position for I/O operations
- seek
- Sets the current position for I/O operations
- skip
- Read and discard data from a Resource until a non-white-space or given character is encountered. Use only on Scanners
- Output text to stdout or stderr
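As a rough sketch of how a few of these verbs could work as ADL-found free functions, consider the following. Every name here is an assumption made for illustration: `io_sketch::memory` is a hypothetical in-memory Resource, and `app::Point` is a hypothetical user type. The point is that the caller picks the verb (`write` for binary, `format` for text) and ADL finds the right overload.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

namespace io_sketch {

// Hypothetical in-memory Resource: a growable byte array plus a cursor.
struct memory {
    std::vector<unsigned char> bytes;
    std::size_t position = 0;
};

// write: copy raw bytes into the Resource at the current position.
inline std::size_t write(memory& m, void const* data, std::size_t size) {
    if (size == 0) { return 0; }
    if (m.position + size > m.bytes.size()) { m.bytes.resize(m.position + size); }
    std::memcpy(m.bytes.data() + m.position, data, size);
    m.position += size;
    return size;
}

// read: copy up to `size` raw bytes out of the Resource.
inline std::size_t read(memory& m, void* data, std::size_t size) {
    std::size_t const available = m.bytes.size() - m.position;
    std::size_t const count = size < available ? size : available;
    if (count == 0) { return 0; }
    std::memcpy(data, m.bytes.data() + m.position, count);
    m.position += count;
    return count;
}

// tell / seek: query and set the current position (seek clamps to the end).
inline std::size_t tell(memory const& m) { return m.position; }
inline void seek(memory& m, std::size_t position) {
    m.position = position < m.bytes.size() ? position : m.bytes.size();
}

} // namespace io_sketch

namespace app {

// Hypothetical user-defined type.
struct Point {
    std::int32_t x;
    std::int32_t y;
};

// Binary form, selected when the caller says write(resource, point).
template <class Resource>
void write(Resource& r, Point const& p) {
    write(r, &p.x, sizeof p.x); // unqualified: ADL on Resource finds its write
    write(r, &p.y, sizeof p.y);
}

// Text form, selected when the caller says format(resource, point).
template <class Resource>
void format(Resource& r, Point const& p) {
    std::string const text =
        "Point(" + std::to_string(p.x) + ", " + std::to_string(p.y) + ")";
    write(r, text.data(), text.size());
}

} // namespace app
```

With this shape, `write(resource, point)` and `format(resource, point)` are both plain unqualified calls; neither the Resource nor the user type needs a base class or a virtual function.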
We now need to express our Concepts. These would have an equivalent type trait available to check if a given type meets one of these requirements, allowing for SFINAE within APIs that rely on them.
- Reader
- Can read from a Resource
- Writer
- Can write to a Resource
- Stream
- Is both a Reader and Writer
- Pipe
- Holds both a Reader and Writer. For every Read, there is a Write
- Scanner
- Can scan from a Resource
- Formatter
- Can format to a Resource
- Channel
- Is both a Scanner and Formatter
- Filter
- Holds both a Scanner and Formatter. For every Scan, there is a Format
- Resource
- Represents a data source or data target
- Buffer
- Manages a Resource by buffering I/O operations
- Seeker
- Allows moving the current I/O operation position of a Resource
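The type traits for these Concepts could be sketched with the usual SFINAE detection idiom. The names below (`is_reader`, `is_writer`, `is_stream`, `read_some`, and the `demo` types) are assumptions of mine, not an existing library; the sketch assumes a Reader is any type for which an unqualified `read(t, void*, size_t)` call is well-formed, and likewise for a Writer with `write`.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

namespace io_sketch {

// C++14-compatible stand-in for std::void_t.
template <class...> struct make_void { using type = void; };
template <class... Ts> using void_t = typename make_void<Ts...>::type;

// is_reader<T>: true when read(t, void*, size_t) can be found via ADL.
template <class T, class = void>
struct is_reader : std::false_type {};

template <class T>
struct is_reader<T, void_t<decltype(read(
    std::declval<T&>(), std::declval<void*>(), std::size_t{}))>>
    : std::true_type {};

// is_writer<T>: true when write(t, void const*, size_t) can be found via ADL.
template <class T, class = void>
struct is_writer : std::false_type {};

template <class T>
struct is_writer<T, void_t<decltype(write(
    std::declval<T&>(), std::declval<void const*>(), std::size_t{}))>>
    : std::true_type {};

// Stream: is both a Reader and a Writer.
template <class T>
struct is_stream
    : std::integral_constant<bool, is_reader<T>::value && is_writer<T>::value> {};

// An API constrained with SFINAE: only participates for Readers.
template <class R, typename std::enable_if<is_reader<R>::value, int>::type = 0>
std::size_t read_some(R& r, void* out, std::size_t size) {
    return read(r, out, size);
}

} // namespace io_sketch

namespace demo {

// Hypothetical Reader-only type (always reports zero bytes read).
struct source {};
inline std::size_t read(source&, void*, std::size_t) { return 0; }

// Hypothetical Writer-only type (pretends every byte was written).
struct sink {};
inline std::size_t write(sink&, void const*, std::size_t n) { return n; }

// Hypothetical type that is both, and therefore a Stream.
struct both {};
inline std::size_t read(both&, void*, std::size_t) { return 0; }
inline std::size_t write(both&, void const*, std::size_t n) { return n; }

} // namespace demo

static_assert(io_sketch::is_reader<demo::source>::value, "source is a Reader");
static_assert(!io_sketch::is_writer<demo::source>::value, "source is not a Writer");
static_assert(io_sketch::is_stream<demo::both>::value, "both is a Stream");
```

None of the `demo` types inherit from anything; they meet a Concept purely by having the right free functions next to them.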
This kind of approach would work well for mixins, as well as allowing
user-defined resources. For instance, implementing a Reader-only Resource for
the SQLite3 Blob object would be as simple as implementing a basic read
function that wraps the sqlite3_blob_read function. One would also be
able to hook into additional types in other libraries, such as libsdl’s
SDL_RWops. The possibilities for inter-library interaction are numerous,
possibly limitless.
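Here is roughly what such a Reader-only Resource might look like. So that the sketch compiles without linking SQLite, `fake_blob`, `fake_blob_read`, and `fake_blob_bytes` are stand-ins I made up; they mimic the shape of `sqlite3_blob_read` (handle, destination buffer, byte count, offset, zero on success) and `sqlite3_blob_bytes`. The `blob_reader` type and the `blob_io` namespace are likewise hypothetical.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

namespace blob_io {

// Stand-in for a sqlite3_blob handle (assumption: not the real type).
struct fake_blob {
    std::string data;
};

// Stand-in with the same shape as sqlite3_blob_read: returns 0 on success
// and nonzero when the requested range falls outside the blob.
inline int fake_blob_read(fake_blob* blob, void* out, int n, int offset) {
    if (n < 0 || offset < 0) { return 1; }
    std::size_t const end =
        static_cast<std::size_t>(offset) + static_cast<std::size_t>(n);
    if (end > blob->data.size()) { return 1; }
    if (n > 0) {
        std::memcpy(out, blob->data.data() + offset, static_cast<std::size_t>(n));
    }
    return 0;
}

// Stand-in with the same shape as sqlite3_blob_bytes: size of the blob.
inline int fake_blob_bytes(fake_blob* blob) {
    return static_cast<int>(blob->data.size());
}

// Reader-only Resource: a blob handle plus the current read offset.
struct blob_reader {
    fake_blob* handle;
    int offset;
};

// The only function the type needs in order to act as a Reader.
// Returns the number of bytes actually read (0 at end of blob or on error).
inline std::size_t read(blob_reader& r, void* out, std::size_t size) {
    int const remaining = fake_blob_bytes(r.handle) - r.offset;
    if (remaining <= 0) { return 0; }
    // Clamp the request, since reading past the end is reported as an error.
    int const count = size < static_cast<std::size_t>(remaining)
        ? static_cast<int>(size)
        : remaining;
    if (fake_blob_read(r.handle, out, count, r.offset) != 0) { return 0; }
    r.offset += count;
    return static_cast<std::size_t>(count);
}

} // namespace blob_io
```

There is no `write` overload for `blob_reader`, which is exactly what makes it Reader-only.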
If there were a library that expressed these concepts, it would most certainly make C++’s approach to I/O competitive with other languages.