Breaking a string into tokens

This operation is also called split, e.g., in Python and C#.

The C library function for this, declared in <string.h>, is char *strtok(char *str, const char *delim).

Example:

#include <stdio.h>
#include <string.h>

char text[] = R"(
How slowly the time passes here, encompassed as I am by frost and snow!
Yet a second step is taken towards my enterprise. I have hired a
vessel and am occupied in collecting my sailors; those whom I have
already engaged appear to be men on whom I can depend and are certainly
possessed of dauntless courage.

Adieu, my dear Margaret. Be assured that for my own sake, as well as
yours, I will not rashly encounter danger. I will be cool,
persevering, and prudent.
)";

int main() {
  char *token;

  token = strtok(text, ".");

  while (token) {
    puts(token);
    token = strtok(nullptr, "."); // Note the nullptr: keep scanning the same string.
  }
}

Output:

How slowly the time passes here, encompassed as I am by frost and snow!
Yet a second step is taken towards my enterprise
 I have hired a
vessel and am occupied in collecting my sailors; those whom I have
already engaged appear to be men on whom I can depend and are certainly
possessed of dauntless courage


Adieu, my dear Margaret
 Be assured that for my own sake, as well as
yours, I will not rashly encounter danger
 I will be cool,
persevering, and prudent

Warning

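strtok modifies the string it is given: it overwrites the delimiter that ends each token with a '\0' character. The string must therefore not be a const string.

For example, declaring the string as char *text leads to a memory error (SIGSEGV: invalid permissions for mapped object), because the pointer then refers to a string literal, which is constant and typically stored in read-only memory.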

  1. On the first call we pass the text; on all following calls we pass nullptr.

  2. If we pass nullptr, strtok continues searching for tokens in the previous string, starting where the last call stopped.

  3. strtok returns nullptr if no further tokens can be found.

Activity 50 (Splitting into paragraphs and words)

  1. Modify the program above to separate it into words.

  2. How would you split into paragraphs?