tamugd-parser

This project is dedicated to helping analyze the massive amounts of data released every semester by Texas A&M University's Registrar's office.

Features:

Parse grade report PDFs published by Texas A&M University's Registrar:
- Automatically adds parsed course data to a MySQL database.

Version 2.0 Roadmap:

- ~~Grade report parsing~~
- ~~Automatically add data to a MySQL database backend~~
- ~~Full rewrite with multithreading~~
- ~~Fully automated grade report updates (auto add new reports)~~

How to set up:

Open MySQL:
```
# open sql prompt
$ sudo mysql
```

Create a MySQL database and user:

```
mysql> CREATE DATABASE database_name_here;
mysql> CREATE USER 'database_user_name_here'@'localhost' IDENTIFIED BY 'database_user_password_here';
mysql> GRANT ALL PRIVILEGES ON database_name_here.* TO 'database_user_name_here'@'localhost';
mysql> FLUSH PRIVILEGES;
mysql> exit;
```

Install dependencies:

```
# start (or reattach to) a screen session, create a virtual environment,
# and install the Python dependencies into it
$ screen -SRD tamugd-parser
$ python3 -m venv venv
$ source venv/bin/activate
$ pip install -r requirements.txt
```

Generate prefs.json, then edit it with your settings:

```
# run in tamugd-parser/
$ python3 src/main.py
$ nano prefs.json
```
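
The contents of prefs.json are not documented here, so the following is only a sketch of how the edited file could be sanity-checked before a long run. The key names in `REQUIRED_KEYS` and the helper `missing_prefs` are hypothetical placeholders, not the tool's real schema; substitute the keys that the generated file actually contains.

```python
import json
from pathlib import Path

# Hypothetical key names for illustration only; check the generated
# prefs.json for the keys the tool really uses.
REQUIRED_KEYS = ("database_name", "database_user", "database_password")


def missing_prefs(path, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in a JSON prefs file."""
    prefs = json.loads(Path(path).read_text(encoding="utf-8"))
    return [key for key in required if not prefs.get(key)]


if __name__ == "__main__":
    prefs_file = Path("prefs.json")
    if prefs_file.exists():
        missing = missing_prefs(prefs_file)
        print("Missing or empty keys:", missing if missing else "none")
    else:
        print("prefs.json not found; run src/main.py once to generate it")
```

A check like this assumes the file holds a single JSON object at the top level.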

Notes

By default, this tool only processes PDFs that are currently available from the Registrar, but it supports parsing PDFs from 2012 onwards (older ones may also work, but they have not been tested). It automatically downloads PDFs as needed. If you have additional PDFs, you can supply your own in the pdfs/ folder; however, you must then use the -s or --start-year flag.

Run python3 src/main.py --help for more detailed information.

Examples

```
# Process all PDFs from the Registrar
$ python3 src/main.py

# Process PDFs from 2014 to present
$ python3 src/main.py --start-year 2014

# Process PDFs from 2014 to 2018
$ python3 src/main.py --start-year 2014 --end-year 2018
```
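
For readers curious how flags like these are typically wired up, here is a minimal argparse sketch. It is not the tool's actual argument parser: only -s/--start-year and --end-year come from this README, while the function name `parse_year_range`, the `None` defaults, and the range validation are assumptions made for illustration.

```python
import argparse


def parse_year_range(argv=None):
    """Parse an optional year range; None means 'no limit on that side'."""
    parser = argparse.ArgumentParser(description="Illustrative year-range flags")
    parser.add_argument("-s", "--start-year", type=int, default=None,
                        help="first year to process")
    # No short form for --end-year is mentioned in this README,
    # so none is defined in this sketch.
    parser.add_argument("--end-year", type=int, default=None,
                        help="last year to process")
    args = parser.parse_args(argv)
    if (args.start_year is not None and args.end_year is not None
            and args.start_year > args.end_year):
        parser.error("--start-year must not be later than --end-year")
    return args.start_year, args.end_year


if __name__ == "__main__":
    # prints (2014, 2018)
    print(parse_year_range(["--start-year", "2014", "--end-year", "2018"]))
```

Run `python3 src/main.py --help` to see the options the tool really accepts.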

Once the script is running you can monitor its progress by using the following command:

```
# Building the database will take a while...
# Detach screen with CTRL+A then CTRL+D while running the main script
# Then run this to display a live feed from the newest logfile
$ cd logs && tail -f $(ls -t | head -1)
```
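
The `ls -t | head -1` step picks the most recently modified entry in logs/. The same selection can be sketched in Python, for example to locate the current log from another script. The helper name `newest_log` is made up for this example, and it assumes the log files sit directly inside logs/.

```python
from pathlib import Path


def newest_log(log_dir="logs"):
    """Return the most recently modified file in log_dir, or None if it has no files."""
    files = [p for p in Path(log_dir).iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    if Path("logs").is_dir():
        latest = newest_log()
        print(latest if latest else "no log files yet")
    else:
        print("logs/ not found; start the main script first")
```

Unlike `tail -f`, this only identifies the file; it does not follow new lines as they are written.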
