C++ Program Parsing and Tokenizer Using Classes Project
Design and implement a C++ program that parses and tokenizes its input, using classes written for that purpose. You will use these classes to extract the passages that form the story from a much larger input file.
Details
You will write code to parse out the different passages in a work of interactive fiction. Reading in and interpreting text is often referred to as parsing. The first step in parsing is to tokenize the input: break it down into smaller chunks, called tokens, that can be analyzed individually. The parser then interprets the resulting sequence of tokens. For exceptionally complex input, tokenization may even be multi-level, with one tokenizer breaking the initial input into coarse tokens that are then fed into another tokenizer to be broken down into smaller tokens.
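The multi-level idea described above can be sketched as follows. This is only an illustration and not part of the assignment: the helper names splitTokens and tokenizeTwoLevel are hypothetical, and the choice of lines as coarse tokens and words as fine tokens is an assumption made for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the assignment): split `input` at every
// occurrence of `delim`. Empty pieces are skipped, so repeated delimiters
// do not produce empty tokens.
std::vector<std::string> splitTokens(const std::string& input, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (start <= input.size()) {
        std::string::size_type end = input.find(delim, start);
        if (end == std::string::npos) {
            end = input.size();  // last piece runs to the end of the input
        }
        if (end > start) {
            tokens.push_back(input.substr(start, end - start));
        }
        start = end + 1;  // continue just past the delimiter
    }
    return tokens;
}

// Two-level tokenization: the coarse tokenizer yields lines, and each line
// is then fed to a finer tokenizer that yields words.
std::vector<std::vector<std::string>> tokenizeTwoLevel(const std::string& input) {
    std::vector<std::vector<std::string>> result;
    for (const std::string& line : splitTokens(input, '\n')) {
        result.push_back(splitTokens(line, ' '));
    }
    return result;
}
```

Calling tokenizeTwoLevel on a two-line string gives one inner vector per line, each holding that line's words. The story tokenizer in this project needs only the coarse level, since each passage is a single token.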
Objective
Your goal is to write a pair of classes to tokenize the passages in interactive fiction stories. The “main” class, StoryTokenizer, will take in the text of an interactive fiction story (often stored in HTML files), which it will then break up into PassageToken objects, each of which represents one passage in the IF story (similar to a chapter).
Text Tokens
Interactive fiction works are divided into passages, which appear inside the HTML tag
Your StoryTokenizer should have two member functions: hasNextPassage and nextPassage. As can be inferred from the name, hasNextPassage returns whether the story contains another passage (i.e., one that has not been read in yet), while nextPassage returns a PassageToken object describing the passage. It should also have a constructor that accepts a string containing the story to tokenize.
PassageTokens should have two member functions, getName and getText, as well as an appropriate constructor. The getName member function should return the name of the passage, as specified by the name attribute of the starting
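One possible shape for the two classes is sketched below. It rests on assumptions that the text above does not confirm: the passage tag name is missing from this description, so the Twine-style element tw-passagedata is used purely as a stand-in, and getText is assumed to return the text between the opening and closing tags. The private member names and the default constructor are also illustrative choices.

```cpp
#include <string>

// ASSUMPTION: the passage tag name is not given in the assignment text.
// The Twine-style "tw-passagedata" element is used here for illustration.
const std::string PASSAGE_OPEN = "<tw-passagedata";
const std::string PASSAGE_CLOSE = "</tw-passagedata>";

class PassageToken {
public:
    PassageToken() {}
    PassageToken(const std::string& name, const std::string& text)
        : name_(name), text_(text) {}
    std::string getName() const { return name_; }
    // ASSUMPTION: the passage text is everything between the two tags.
    std::string getText() const { return text_; }

private:
    std::string name_;
    std::string text_;
};

class StoryTokenizer {
public:
    explicit StoryTokenizer(const std::string& story) : story_(story), pos_(0) {}

    // True if another opening passage tag exists at or after the read position.
    bool hasNextPassage() const {
        return story_.find(PASSAGE_OPEN, pos_) != std::string::npos;
    }

    // Returns the next passage and advances past its closing tag.
    // Returns an empty token if no complete passage remains.
    PassageToken nextPassage() {
        const std::string::size_type npos = std::string::npos;
        std::string::size_type open = story_.find(PASSAGE_OPEN, pos_);
        if (open == npos) return PassageToken();
        std::string::size_type tagEnd = story_.find('>', open);
        std::string::size_type close =
            (tagEnd == npos) ? npos : story_.find(PASSAGE_CLOSE, tagEnd);
        if (close == npos) {  // malformed input: stop tokenizing
            pos_ = story_.size();
            return PassageToken();
        }
        // Pull the value of the name attribute out of the opening tag.
        std::string tag = story_.substr(open, tagEnd - open);
        std::string name;
        const std::string key = " name=\"";
        std::string::size_type nameStart = tag.find(key);
        if (nameStart != npos) {
            nameStart += key.size();
            std::string::size_type nameEnd = tag.find('"', nameStart);
            if (nameEnd != npos) name = tag.substr(nameStart, nameEnd - nameStart);
        }
        std::string text = story_.substr(tagEnd + 1, close - tagEnd - 1);
        pos_ = close + PASSAGE_CLOSE.size();
        return PassageToken(name, text);
    }

private:
    std::string story_;
    std::string::size_type pos_;  // index where the next search begins
};
```

A caller would typically loop while hasNextPassage() is true and call nextPassage() on each iteration. Because the search always starts from the stored position and looks only for the opening tag, any text between passages is skipped without special handling.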
Assembling the Code
You have been provided with a main function that will read in a story from input.txt and use your StoryTokenizer and PassageToken classes to break down that story into its constituent passages. Your tokenizer should appropriately ignore any text in the input file that is not part of a passage. You have also been provided with a couple of example input files you can use to test your tokenizer.
Though there is more than one way to implement your tokenizer, you may wish to take advantage of the find, substr, and/or at member functions of the string class when implementing your code. Check the online documentation (www.cplusplus.com) for more information.
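As a rough illustration of how those three std::string member functions behave, the sketch below uses find to locate markers, substr to copy out the text between them, and at for bounds-checked character access. The helper names textBetween and skipSpaces are hypothetical and are not required by the assignment.

```cpp
#include <string>

// Hypothetical helper: return the text between the first `open` marker found
// at or after index `from` and the next `close` marker that follows it.
// Returns an empty string if either marker is missing.
std::string textBetween(const std::string& s, const std::string& open,
                        const std::string& close,
                        std::string::size_type from = 0) {
    std::string::size_type begin = s.find(open, from);
    if (begin == std::string::npos) return "";
    begin += open.size();  // step past the opening marker itself
    std::string::size_type end = s.find(close, begin);
    if (end == std::string::npos) return "";
    return s.substr(begin, end - begin);  // substr takes (start, length)
}

// Hypothetical helper: index of the first non-space character at or after i.
// at() throws std::out_of_range on a bad index, unlike operator[].
std::string::size_type skipSpaces(const std::string& s,
                                  std::string::size_type i) {
    while (i < s.size() && s.at(i) == ' ') {
        ++i;
    }
    return i;
}
```

Note that find reports failure by returning std::string::npos, so every result should be compared against it before being passed to substr.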
Output
You should submit header and source files for your StoryTokenizer and PassageToken classes as a zip archive. You may combine both of them into a single header and single source file, or you may submit two of each. If you do not combine the headers together, you should #include the PassageToken header at the top of your StoryTokenizer header (storytokenizer.h).
Your output should match the provided example output. Opening this file in a web browser will allow you to play through the story.
Attachments
20190601182814storytokenizerdotcpp (1kB)
20190601183929example_storydothtml (102 kB)
20190601183940example_story_simple (2 kB)
20190601184002storytokenizer_output (2 kB)