The program looks for two files, named trainingSet.txt and testSet.txt, and writes its results to stdout. If either file doesn't exist, the program will crash. Specs for these files are found here: It is built in Python 3, so running it should be simple on most computers:

python3 main.py
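Because a missing input file crashes the program, it can help to confirm both files are present before running it. The snippet below is a minimal sketch of such a check, not part of the project: the helper name missing_inputs and the choice of the current directory as the default location are assumptions made for illustration.

```python
import os

# File names the README says the program looks for.
REQUIRED_FILES = ("trainingSet.txt", "testSet.txt")


def missing_inputs(directory="."):
    """Return the required input files that are absent from `directory`.

    Hypothetical helper; the default directory is an assumption.
    """
    return [
        name
        for name in REQUIRED_FILES
        if not os.path.isfile(os.path.join(directory, name))
    ]


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing input file(s): " + ", ".join(missing))
    else:
        print("Both input files found; run: python3 main.py")
```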

About

Construction

The program is split into a library and an executable. All the computation and processing happens in lib.py, while the executable, main.py, just calls the correct functions and provides a CLI.

The code has been tested for compatibility down to Python 3.3.2. It may work with older versions, but they are not supported. ⚠️ The script will not work on Python 2.

Accuracy

💹 In our tests, the sentiment analyzer achieved 79.7% accuracy. Some ideas for improving accuracy are:

Increasing our training data
Spell-correcting misspelled words
Trimming excess characters. For example, soooooooo good should be considered the same as soo good.
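
The last idea, trimming excess characters, could be sketched with a regular expression that collapses any run of three or more identical word characters down to two. This is only an illustration under that assumption: the function name trim_repeats is hypothetical, and lib.py does not currently do this.

```python
import re

# A word character followed by two or more copies of itself,
# i.e. a run of three or more identical characters.
_REPEATS = re.compile(r"(\w)\1{2,}")


def trim_repeats(text):
    """Collapse runs of 3+ identical word characters to exactly two.

    Hypothetical preprocessing helper, not part of lib.py.
    """
    return _REPEATS.sub(r"\1\1", text)


print(trim_repeats("soooooooo good"))  # soo good
print(trim_repeats("soo good"))        # soo good (unchanged)
```

Keeping two characters rather than one leaves ordinary double letters, such as the oo in good, untouched.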