Lab 3 (WC Utility)

Programming Workshop 2 (CSCI 1061U)

Winter 2021

Faculty of Science

Ontario Tech University


Introduction

You are asked to develop a clone of the wc utility found in most Unix systems. The wc utility displays the number of lines, words, and bytes contained in each input file.

Consider a text file hello-world.txt:

me: hello world
me: how do you do?
world: i am doing great.
me: are you really?  what about global warming?

When we use the wc utility to count the number of lines, words, and bytes in hello-world.txt, we get:

$ wc hello-world.txt
       4      21     108 hello-world.txt

Multiple files as input

Now consider file pablo-neruda.txt with contents as follows:

One Hundred Love Sonnets: XVII
BY PABLO NERUDA
TRANSLATED BY MARK EISNER
I don’t love you as if you were a rose of salt, topaz,
or arrow of carnations that propagate fire:
I love you as one loves certain obscure things,
secretly, between the shadow and the soul.

I love you as the plant that doesn’t bloom but carries
the light of those flowers, hidden, within itself,
and thanks to your love the tight aroma that arose
from the earth lives dimly in my body.

I love you without knowing how, or when, or from where,
I love you directly without problems or pride:
I love you like this because I don’t know any other way to love,
except in this form in which I am not nor are you,
so close that your hand upon my chest is mine,
so close that your eyes close with my dreams.

If we use the wc utility on both files, it produces the following output:

$ wc hello-world.txt pablo-neruda.txt
       4      21     108 hello-world.txt
      19     149     806 pablo-neruda.txt
      23     170     914 total
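
The layout of this output can be reproduced with a small per-file record and a formatting helper. The sketch below is only one possible approach, not a required design: the names FileCounts, format_row, and sum_counts are hypothetical, and the 8-character column width is inferred from the sample output above rather than stated in this handout.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record holding the counts for one input file.
struct FileCounts {
    std::string name;
    std::size_t lines;
    std::size_t words;
    std::size_t bytes;
};

// Format one output row: three right-aligned 8-character columns,
// then a space and the file name (width inferred from the sample output).
std::string format_row(const FileCounts& fc) {
    std::ostringstream out;
    out << std::setw(8) << fc.lines
        << std::setw(8) << fc.words
        << std::setw(8) << fc.bytes
        << ' ' << fc.name;
    return out.str();
}

// Add up the counts of all files to build the final "total" row.
FileCounts sum_counts(const std::vector<FileCounts>& files) {
    FileCounts total{"total", 0, 0, 0};
    for (const FileCounts& fc : files) {
        total.lines += fc.lines;
        total.words += fc.words;
        total.bytes += fc.bytes;
    }
    return total;
}
```

Printing format_row for each file, followed by format_row(sum_counts(files)), would give one row per file and a closing total row. The total row is only printed when more than one file is given.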

Task

Part 1 (70%)

Develop a wc utility in C++ that can handle a single file (as seen above). Put this code in file part1.cpp.

Part 2 (30%)

Now extend your code so that it can handle multiple input files (as seen above). Put this code in file part2.cpp.

Things to consider

A few items to remember:

  1. A word is defined as a string of characters delimited by whitespace or newline characters. Whitespace characters are the set of characters for which the iswspace(3) function returns true. Newline characters are \n and \r.
  2. A line is defined as a string of characters delimited by a newline character.
  3. Characters beyond the final newline character will not be included in the line count.
  4. You are not allowed to use the built-in wc to do the work for you.
  5. You are encouraged to use the standard library headers fstream, iostream, and string in your code.
  6. For part 2, you are encouraged to consider using C++ structures to store file-specific information.

Submission

Please submit the part1.cpp and part2.cpp files via Canvas.