OK, so this tip thing doesn't look like it's going to be weekly. But hey, close enough right?
This week I want to talk about two classes that are both part of the C++ Standard Library, both do similar things and yet are fundamentally different. String streams.
The modern way
If you write decent C++ you may already be familiar with std::stringstream from the <sstream> header. It provides a buffered stream interface over the standard std::string object, allowing very easy manipulation of the underlying buffer. In particular, it's often used due to the large number of conversion opportunities exposed by the stream interface.
For example, if you wanted a numeric string in standard C, you'd probably have done something like this:
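A minimal sketch of that pattern, written with the C++ headers so it sits alongside the later examples. The function name number_to_cstring_demo and the value 123 are illustrative assumptions, and it calls snprintf, the bounds-checked sibling of sprintf, so an undersized buffer would truncate rather than overflow.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format a three-digit number into a fixed-size C buffer.
std::string number_to_cstring_demo() {
    char buffer[4];  // three digits plus the terminating NUL, sized by hand
    int written = std::snprintf(buffer, sizeof buffer, "%d", 123);
    assert(written == 3);        // sanity check: the digits fit the guess
    return std::string(buffer);  // copied out only so the result can be inspected
}
```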
Annoyingly, you had to pre-allocate the amount of memory that you thought would be required. In the example above that's easy: I knew I needed space for three characters and the terminating NUL, so I allocated four characters for the C-string.
But it wasn't always that easy. If you wanted to take input of unpredictable length and put that into a string, you'd simply have to guess at the space required, and if there wasn't enough that was just too damn bad. But hey, at least you could put almost anything into it thanks to printf and its friends.
Then along came C++ with its standard string wrapper, with its dynamic allocation and, most importantly, dynamic resizing. You can concatenate with a single operator and you can add as much data to it as you like without worrying about running out of pre-allocated buffer space.
// C++
std::string stdstr = "Hello ";
stdstr += "world";
std::string concatenation is great. But the class is not very good at implicit conversion:
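One way to see the problem, as a hedged sketch rather than the original listing: appending an int to a std::string compiles, but the int is quietly converted to a single char instead of being formatted as text. The function name append_five_naively is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper showing what happens when an int meets std::string.
std::string append_five_naively() {
    std::string str = "Hello ";
    str += 5;  // compiles, but appends the character with code 5, not the text "5"
    assert(str.size() == 7);  // exactly one (unprintable) character was added
    return str;
}
```

The result is seven characters long and ends in an unprintable control character, which is rarely what anyone wanted.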
Thankfully, the streams interface is great at this, as it has all sorts of conversions loaded into the << operator. Since stringstream is a stream built over a string, we can use it to easily manipulate the underlying string in ways the string itself would never allow us to:
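A minimal sketch of the idea, assuming nothing beyond <sstream>; the function name hello_five_worlds is illustrative only.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: mix text and a number through the << operator.
std::string hello_five_worlds() {
    std::stringstream ss;
    ss << "Hello " << 5 << " worlds";  // the int is formatted as the text "5"
    return ss.str();                   // hand back the accumulated string
}
```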
Of course, the above is a silly, simplistic example as we could have written std::string str = "Hello 5 worlds" directly, but the technique is useful when you don't know in advance what that number's going to be:
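For instance, the number might arrive as a function argument. The sketch below assumes a hypothetical helper named greet_worlds; only the stringstream calls come from the standard library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: the count is only known at run time.
std::string greet_worlds(int count) {
    std::stringstream ss;
    ss << "Hello " << count << " worlds";  // formatted whatever the value is
    return ss.str();
}
```

No buffer size has to be guessed here: the stream grows its string to fit whatever count turns out to be.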
The simplicity of this approach seems like a lifesaver, and it is in fact used all the time where people would once have used sprintf with a fixed C-string buffer.
However, there is one oft-overlooked flaw with this approach. stringstream.str() returns a copy of the string buffer, not a reference. In fact, there is no way to get a reference to the string buffer of a stringstream. This means that every time you pull a string from a stringstream, the data is copied in memory.
It might not seem like such a big deal unless you're frequently creating a stringstream purely to use its conversion facilities, then grabbing the underlying string for further use. You're wasting memory and CPU cycles.
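The copy is easy to observe. In this sketch (the function name is an assumption for illustration), changing the string returned by str() leaves the stream's own contents untouched, which is only possible if the two are separate pieces of memory.

```cpp
#include <sstream>
#include <string>

// Hypothetical check: is the string from str() independent of the stream?
bool str_returns_independent_copy() {
    std::stringstream ss;
    ss << "Hello " << 5 << " worlds";

    std::string copy = ss.str();  // the stream's characters are copied out here
    copy[0] = 'J';                // modify our copy only

    // The copy changed, the stream's buffer did not.
    return copy == "Jello 5 worlds" && ss.str() == "Hello 5 worlds";
}
```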
Looking backwards
There exists a standard alternative that a lot of people don't know about, with the similar name std::strstream. It is deprecated, and experts have discouraged its use for at least eight years.
This recommendation may seem a little premature when you consider that strstream is not being dropped from the upcoming C++0x, and the next standard version after that is not expected until we approach 2020.
But more importantly, what most of these experts opt not to mention is that the underlying data of a strstream is an old C-style character array rather than a C++ string object. Because of this, we get direct access to the data without having to go through a protective layer of abstraction.
Specifically, where stringstream.str() gives us a copy of a string object (which copies the string), strstream.str() gives us a copy of a pointer to characters (which does not).
We'd still have to create a copy of the C-style string if we wanted to use all the functionality of C++ strings because std::string doesn't give us a choice, but now we have a C-style string that wasn't copied and we can do what we like with it.
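A hedged sketch of that difference, assuming a toolchain that still ships the deprecated <strstream> header (expect a deprecation warning). The function name is made up. Two details are easy to miss: the stream does not NUL-terminate for you, hence std::ends, and str() freezes the buffer, so it has to be unfrozen again if the stream is to release its memory.

```cpp
#include <string>
#include <strstream>  // deprecated header; compilers may warn about it

// Hypothetical check: str() hands out a pointer into the stream's own buffer.
bool strstream_str_is_not_a_copy() {
    std::strstream ss;
    ss << "Hello " << 5 << " worlds" << std::ends;  // std::ends appends the NUL

    char* first = ss.str();   // freezes the buffer and returns a pointer to it
    char* second = ss.str();  // same buffer again, nothing was duplicated
    bool same_buffer = (first == second);

    std::string owned(first);  // a copy happens only when we ask for one

    ss.freeze(false);  // unfreeze so the stream frees its buffer on destruction
    return same_buffer && owned == "Hello 5 worlds";
}
```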
A complicated manipulation of strstream's underlying buffer might look like this:
char* c = ss.str();
memcpy(c+8, "!", sizeof(char));
cout << ss.str();
// Output: "HI WORLD!"
So it's not beautiful, but it does demonstrate the added power of direct stream buffer access.
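For anyone who wants to try this, here is one self-contained way to set it up. The stream's initial contents are an assumption (the text "HI WORLD", one placeholder character at index 8, then a NUL), chosen so that overwriting index 8 yields the output quoted above. The function name is likewise illustrative, and the result is returned as a std::string rather than printed so it can be checked.

```cpp
#include <cstring>
#include <string>
#include <strstream>  // deprecated header; compilers may warn about it

// Hypothetical helper: edit the stream's buffer in place through str().
std::string hi_world_with_bang() {
    std::strstream ss;
    // Assumed setup: eight characters, a placeholder at index 8, then the NUL.
    ss << "HI WORLD" << ' ' << std::ends;

    char* c = ss.str();                     // pointer into the live buffer
    std::memcpy(c + 8, "!", sizeof(char));  // overwrite the placeholder in place

    std::string result(ss.str());  // second str() call sees the modified buffer
    ss.freeze(false);              // let the stream free its buffer again
    return result;
}
```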
If you like the options provided by the stream interface and find yourself concerned that you're copying string data needlessly, or have a need to modify the underlying buffer data, stop and think for a moment before throwing strstream mercilessly to the hounds of time. Because it has use yet.
Bootnote
I apologise for the use of the past tense when referring to C. Yes, I know the language is still very much alive and kicking and that plenty of people still use it. However, in this article's C++ context it's merely a precursor. So you'll just have to get used to it.