Filter CSV by age

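Demonstration of the program. Warning: `cat` may not be available in every shell (in PowerShell it is an alias for `Get-Content` rather than the Unix tool). I used it only to show the contents of files, which you can view in the editor instead. Some of the `cat` commands in the demonstration are mistyped and can be ignored.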

Implement a program that filters people by age from a dataset.

The input is a comma-separated values (CSV) file. If a row does not contain two columns, output an error message.

Comma-separated values

A plain-text format for storing tabular data, in which each record is one line and the values of a record are separated by commas.

Milestones:

  1. Begin by processing people-with-age.csv line by line. If you encounter a blank line, warn the user but continue processing, printing every non-blank line.

    ./main.exe people-with-age.csv
    
  2. Parse one line into tokens and filter by age

  3. Recognize the three kinds of malformed lines:

    1. Whole line missing

    2. Age missing (no token)

    3. Age not recognized (token cannot be converted to a number)

  4. Support output to a file

    ./main.exe people-with-age.csv out.csv
    
  5. Also support reading the input from stdin as follows:

    "Anna, 12" | ./main.exe
    
    type people-with-age.csv | ./main.exe
    
    echo "Anna, 12" | ./main.exe
    
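The three malformed cases from milestone 3 can be told apart with `strtok` and `sscanf`. The following is only a minimal sketch, not the reference solution: the enum `line_status` and the function `classify_line` are hypothetical names introduced for illustration, and the sketch assumes each line has the form `name, age`.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes for this sketch; not part of the assignment. */
enum line_status { LINE_OK, LINE_BLANK, LINE_AGE_MISSING, LINE_AGE_INVALID };

/* Classifies one CSV line of the assumed form "name, age".
 * strtok modifies `line` in place, so pass a writable buffer.
 * On LINE_OK, *age holds the parsed value. */
enum line_status classify_line(char *line, unsigned *age) {
  /* Cut the line at the first newline or carriage return. */
  line[strcspn(line, "\r\n")] = '\0';
  if (line[0] == '\0')
    return LINE_BLANK; /* case 1: whole line missing */

  char *name = strtok(line, ",");    /* first token: the name */
  char *age_str = strtok(NULL, ","); /* second token: the age, or NULL */
  (void)name;                        /* unused in this sketch */
  if (!age_str)
    return LINE_AGE_MISSING; /* case 2: no age token */

  /* %u skips leading whitespace; sscanf returns 1 only if a number was read. */
  if (sscanf(age_str, "%u", age) != 1)
    return LINE_AGE_INVALID; /* case 3: token is not a number */
  return LINE_OK;
}
```

Because `strtok` overwrites the delimiter with a terminator, keep a copy of the line if you need to forward it unchanged later.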

people-with-age.csv:

Nuvi Våle, 18

Aeral Körn
Lumio Satō, 29
Veski Ruañ, 12

The output file must not contain any error messages.
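One common way to meet this requirement is to write accepted rows to the output stream and all diagnostics to `stderr`. The sketch below illustrates the idea; `report_error` and `emit_row` are hypothetical helper names, and the message format is an assumption.

```c
#include <stdio.h>

/* Hypothetical helper: diagnostics always go to stderr, never to the output. */
void report_error(size_t line_no, const char *message) {
  fprintf(stderr, "line %zu: %s\n", line_no, message);
}

/* Hypothetical helper: accepted rows go to the chosen output stream.
 * Returns 0 on success and -1 if the write failed. */
int emit_row(FILE *out, const char *row) {
  return fputs(row, out) == EOF ? -1 : 0;
}
```

With this split, error messages also stay out of the data when the output goes to stdout and is redirected to a file by the shell.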

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define LINE_MAX 100
#define DELIM "," // CSV delimiter

char *ifile, *ofile;
unsigned filter_age_max;
FILE *istream, *ostream;

#define USAGE                                                                  \
  R"(Filters CSV rows, keeping only those whose age is at most max-age
%1$s max-age [input-file] [output-file]

Example: 
%1$s max-age input.csv output.csv
%1$s max-age input.csv (output to stdout)
%1$s max-age           (input from stdin, output to stdout)
)"

void filter_stream(FILE *istream, FILE *ostream) {
  char line[LINE_MAX];
  char *fgets_return;
  char *name, *age_str;
  size_t line_no = 0;

  while (!feof(istream)) {
    ++line_no;

    // Read a line from `istream` and assign the return value to `fgets_return`
    // YOUR CODE HERE

    if (fgets_return && *fgets_return != '\n') {
      if (strlen(line) > 1) {
        // Assign `name` and `age_str` using `strtok`
        // YOUR CODE HERE

        // Alternative to strtok:
        // sscanf(line, "%*[^,],%u", &age);

        if (!age_str) {
          // Error message
          // YOUR CODE HERE
          continue;
        }
      }
    } else {
      // Error message
      // YOUR CODE HERE
      continue;
    }

    // Age processing
    unsigned age;
    // `age` is unsigned, so the matching conversion specifier is %u
    int recognized_count = sscanf(age_str, "%u", &age);
    if (recognized_count == 1) {
      if (age <= filter_age_max) {
        // Forward input line to `ostream`
        // YOUR CODE HERE
      }
    } else {
      // Error message
      // YOUR CODE HERE
    }
  }
}

int main(int argc, char *argv[]) {
  switch (argc) {
  case 4:
    // max-age ifile ofile
    ofile = argv[3];
    // fall through (intentional)
  case 3:
    // max-age ifile
    ifile = argv[2];
    // fall through (intentional)
  case 2:
    // max-age
    // sscanf returns EOF (nonzero) on empty input, so compare against 1
    if (sscanf(argv[1], "%u", &filter_age_max) != 1) {
      fputs("First argument is not an age.\n", stderr);
      exit(EXIT_FAILURE);
    }
    break;
  default:
    printf(USAGE, argv[0]);
    return EXIT_SUCCESS;
  }

  if (ifile) {
    // Open `ifile` and assign it to `istream`
    // YOUR CODE HERE

    // Exit program with an error message if file cannot be opened
    // YOUR CODE HERE
  } else {
    // Set `istream` to stdin if no file provided
    // YOUR CODE HERE
  }

  if (ofile) {
    // Open `ofile` and assign it to `ostream`
    // YOUR CODE HERE

    // Exit program with an error message if file cannot be opened
    // YOUR CODE HERE
  } else {
    // Set `ostream` to stdout if no file provided
    // YOUR CODE HERE
  }

  filter_stream(istream, ostream);
}
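The two stream-selection branches in `main` follow the same pattern: open the named file if one was given, otherwise use a standard stream. As a hedged illustration of that pattern only, the sketch below uses a hypothetical helper `open_or_default`; the skeleton does not require such a function.

```c
#include <stdio.h>

/* Hypothetical helper: opens `path` with `mode`, or returns `fallback`
 * (for example stdin or stdout) when no path was given.
 * Returns NULL and prints the reason to stderr if the file cannot be opened. */
FILE *open_or_default(const char *path, const char *mode, FILE *fallback) {
  if (!path)
    return fallback;
  FILE *stream = fopen(path, mode);
  if (!stream)
    perror(path); /* e.g. "missing.csv: No such file or directory" */
  return stream;
}
```

A caller could then write `istream = open_or_default(ifile, "r", stdin);` and exit with `EXIT_FAILURE` when the result is `NULL`.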

Requirements:

  • Deliver a high-level flowchart in which at least the line-by-line processing is visible, but not how each error is recognized.

Some concepts used from previous chapters: