CSCE-315: Programming Studio (Fall 2020)

Project 2: Database

Due dates and updates

Here are the various due dates. See near the end for details (i.e., what you need to submit for each submission window.

[Phase 1] Design documents : due 9/20 Sunday 11:59pm
[Phase 2] JDBC-based client: due 9/27 Sunday 11:59pm
[Phase 3] Java Swing-based GUI : due 10/7 Wednesday 11:59pm
[Phase 4] Dashboard + 400-word Retrospective : due 10/18 Sunday 11:19pm

Any updated info about the project will also be posted here.

Skeleton code: https://github.tamu.edu/choe/315-db

Team Assignment

The team assignment will be made based on the team assignment survey. Default team size is 4 to 5. See piazza posting for the team assignment. Schedule a regular meeting with your team mates ASAP.

In a nutshell

In this project, your team is given a large SQL database called AdventureWorks. AdventureWorks is an online transaction processing (OLTP) database for a fictional retail business, made by Microsoft. It has over 110,000 transactions, over 18,000 people's info, etc.

Your task is to design and develop a GUI-based database client and a database analytics platform for this database. This project also includes an individual writing component (400-word project retrospective).

Provide GUI-based interface to the database for most common queries that may arise in the domain.
Provide data analytics function (compute various statistics from the database, such as demography, sales statistics, regional statistics, seasonal sales, etc.), and a dashboard to display key information from the database.

Resources:

Development environment

The development environment of this project is as follows:

Programming language: Java
Database server: MySQL database by Oracle (running on your local docker)
GUI: Java Swing (you may use the Netbeans IDE, but only if you're already familiar with it. Instructor/TAs will not provide guidance for Netbeans.)
Database connectivity: JDBC (Jave DataBase Connectivity)
Testing: JUnit

PART I: Database

Phase 1. Design document

First, set up your MySQL server environment and load up the AdventureWorks database to the server (see the README file in 315-db repo). Note that although phase 1 is mostly about the design document, you must play with the actual MySQL server with the AdventureWorks DB loaded in order to write the specification.

Once you're ready, you should explore the database: what tables it has, what are the columns, what are the keys, etc., and how they are all connected. While doing this, think about what kind of queries may be helpful to a user. Based on this initial hands-on investigation of the data, you will produce the following:

Key functionality your DB client will provide
An ER-diagram of the major subset (this has to be substantial) of the DB that your DB client will support.
List of major SQL queries you plan to use, and example output.
- At least 20 different kinds of queries.
- Most queries must involve multiple tables (JOIN).
- Aggregate functions such as COUNT, SUM, AVG, etc. must be included at least 5 times.
- GROUP BY statement must be used at least 2 times.
- ORDER BY statement must be used at least 2 times.
- LIKE operator must be used at least 1 time.
Overall system design: Must include the following and how they interface with each other.
- JDBC-based database connectivity.
- Java GUI
- Dashboard
- Major class names and how they are expected to interact with each other (flow of information and control).
Major milestones (must be more detailed than a simple list of the 3 submission phases).

* Example Queries *

show tables;

show columns from customer;

select * from customer where AccountNumber="AW00029475";

select e.employeeid, e.addressid from employeeaddress e inner join address a using(addressid);

select count(orderdate) from salesorderheader group by orderdate order by orderdate;

select * from information_schema.columns where column_name like '%productphotoid%'

Phase 2. JDBC-based client

JDBC allows you to connect to a DB server (e.g., a MySQL server or Postgres server, etc.) and send queries, get the results back, and update the entries. JDBC Basics, by Oracle is a good place to start.

For this phase, your team will develop basic function in a Java application to serve as the DB client. The specifics may vary depending on your design from phase 1, but the Java application should demonstrate the following functions and requirements.

Connect to MySQL server, with proper authentication credentials, and disconnect properly.
The main user interface is a simple command line interface, taking user input from keyboard and printing out the results on the screen. For this, you will need to write a basic command line interpreter (this will have a really simplt syntax, so you can just use as switch statement).
- All SQL commands should be be accepted as is (in READ-only mode -- no creation of tables, inserts, etc.). (Just pass on the string to the server and display the returned results.)
- In addition, implement the custom commands shown below. (Note: This seems too many, but these mostly correspond to finding the right SQL query and passing on to the server.)
  - jdb-show-related-tables <table-name>
    Given the table <table-name>, list all other tables that have in their column one or more of the primary keys of the table <table-name>.
  - jdb-show-all-primary-keys
    Show all primary keys from all tables. Print the list of (table_name, column_name).
  - jdb-find-column <column-name>
    Find all tables that have <column-name>
  - jdb-search-path <table1> <table2>
    Given two table names <table1> and <table2>, find the path of intermediate tables you have to go through from <table1> to reach <table2>.
  - jdb-search-and-join <table1> <table2>
    Given <table1> and <table2> that are connected through multiple other tables, produce a joined table where rows in table1 with multi-hop linkage to the entries in table 2 appear on the same row in the resulting joined table.
  - jdb-get-view <view-name> '(' < sql query > ')';
  - jdb-stat <table> (or <view-name>) <column_name>
    Given a <table> (or a <view>), return basic statistics such as min, max, mean, median, and print out a histogram (using ASCII characters). The x axis is frequency, and y-axis is the bins. Both axes should be scaled automatically to fit the histogram data.
```
  0__10__20__30__40__
0 |***
1 |*****
2 |************
3 |********
4 |**
5 |****
```
  - In addition to the above, at least 5 custom commands must be designed and implemented that may be used toward phase 3 and phase 4.
- Appropriate error messages must be shown when invalid commands are given, or nonexistent tables or columns are referenced.

Example command-line interaction scenario is shown below.

* Command-line interface and Custom commands to be Implemented*

Note: jdb> is the prompt. 
Note: all custom commands are prefixed by "jdb-".

jdb> show tables;

.... 

jdb> show columns from customer; 

....

jdb> jdb-show-related-tables employeeaddress;

...

jdb> jdb-show-all-primary-keys;

...

jdb> jdb-get-view SALESVIEW ( select (count ....) as SALESSTAT from ... join .... )
jdb> jdb-stat SALESVIEW SALESSTAT

Phase 3. Java Swing-based GUI

In this phase, your team will develop a Java Swing-based GUI interface for the core functionality you developed in phase 2. The GUI must include basic built-in function, such as

show list of tables
show one or more columns of a specific table
show the result of joining up to 4 tables
text entry field that accepts raw SQL queries and display the results.

AND, in addition, all custom commands from phase 2.

jdb-show-related-tables
jdb-show-all-primary-keys
jdb-find-column
jdb-search-path
jdb-search-and-join
jdb-stat
your custom commands

AND, further in addition, implement:

jdb-plot-schema
Using the Graphviz library for JAVA, display the DB schema. (https://github.com/nidi3/graphviz-java; Also see https://graphviz.org/, the original library). See Example schema plot in PDF in the 315-db repo

Phase 4. Dashboard

This part is intentionally left open-ended. Using what you've built so far, evolve your DB GUI client to have simple data analytics functions with a dashboard feature. Here's an example and some insight on business dashboards. Some ideas include:

Weekly, Monthly, and Yearly statistics: number of orders, total sales amount, number of customers.
Employee statistics: histogram by age, total salary, mean and median salary, etc.
Regional statistics: above info, by state
Product demand by season: top 10 popular products during months X to Y.
Customer demographic: by region, by age, by gender etc.

You may use any open-source or third-party graphing/charting libraries for this phase. For exmaple, see XChart,

Phase 4 (writing component). Development Retrospective Report (400 words)

Note: This writing report is an individual assignment, so everyone must submit separate from the coding project itself. A separate submission stub will be created on eCampus for this. As mentioned in the syllabus, this course is a writing intensive course, and as an integrat part of the course design, you are required to write a substantial article. This writing assignment will be 400 words, which is about 1 page of single-space text. In your article, include the following (no need to address all the details listed here: choose the more important ones in your view):

Project overview: very brief summary of the project.
Team experience: how was the team experience, was the project engaging, did the project give you a chance to innovate? What were the challenges?
Product analysis: How would you rate the quality of your final product? What did you learn that you could apply to future work? What would you have done differently?
Work analysis: Are you proud of the work? What went well? What could be improved?
Lessons learned: What valuable lessons have you learned through this process?
Your contribution: Briefly summarize your technical contribution, and what percentage effort you gave (25% for a team of 4 when everyone contributed equally).

Writing resources:

James A. Bednar's Writing tips.
Qiang Hao's Bits and Pieces blog post on developing Google Docs Add-on in Two weeks (same as link in the text above).
Google doc grammar correction feature: When drafting an article for ALL your writing assignments for this course, do so in Google doc and use this grammar correction feature extensively.

Deliverables and Requirements

Use of github is mandatory. http://github.tamu.edu. Give access to the TA.
You must maintain a development log (wiki page in github.tamu.edu titled "Development log") updated by you. This log will be graded. There is no designated format, except that you need to time stamp and write a brief description of the activity. We will check your daily progress. Note: This is a free-formatted notebook, different from the development blog writing component. However, if you write down your experience and your activities, it'd be easier to write your development blog article.
Demo in the lab may be required.

Intermediate submissions:
1. 10%: layout, style, comments
2. 30%: implementation (regardless of actual functioning, but must be substantial)
3. 50%: function (properly functioning)
4. 5%: development log
5. 5%: regular use of github (semi-daily commits)
Project Final Grading (Coding): 100/100, worth 25% of your Final course grade). Rubrics are as follows:
1. 5%: layout, style, comments, development log, github use
2. 20%: phase 1 grade
3. 20%: phase 2 grade
4. 20%: phase 3 grade
5. 20%: phase 4 grade
6. 15%: improvements on phase 1-3 in final submission.
Project Final Grading (Writing) (100/100, worth 5% of your Final course grade)
1. 80/100 of the grade for this assignment will be based on what you submit for the final project deadline.
2. 20/100 of the grade for this assignment will be based on a reivision of your article after you have received feedback from the instructor/TA.
3. Rubrics (motivated by Utha State University rubric:
  - Content, structure, and organization: 70/100 (each item 10 points each)
    1. Content is effective and clearly organized
    2. Article follows logical, consistent pattern of development of ideas
    3. Paragraphs discuss one main idea
    4. Paragraphs are appropriate in length (5 +- 2 sentences)
    5. Each sentence contains one main idea.
    6. Transitions between paragraphs are smooth.
    7. Appropriate references are made (5 or more links to resources are mandatory, put in a "Reference" section at the end).
  - Spelling, grammar, punctuation: 30/100.

Submission

More detailed submission instructions will be posted on piazza.
All submissions should be through ecampus.tamu.edu
First, branch your latest project into an archival branch named: Submission 1, Submission 2, and Submission 3, etc. for the code submissions, respectively.
Include all files, other code, and related media for your project in a single zip file.
Use the "Download ZIP" feature in github and upload the resulting zip file. Include all documents like development log, etc. in the zip file (you can add these after you download the zip from GitHub).
Standard late penalty applies.
File naming convention:

Main zip file: 315-fall2020-<section>-<team-ID>-prj2-<submission #>.zip (submission # = 1, 2, 3, 4, etc., for weekly submissions, team-ID = 1, 2, 3, 4, etc., respectively).
Include a README file which explain what each file is for.
if you're using docker, all docker-related files (*.yml, phpdocker/, etc.): entire docker directory, so that the container can be instantiated and tested.
Any deviations from these naming will each result in -1 (out of 100) for that submission.