Q1 LA City Employee Payroll

The /home/m280-data/la_payroll/LA_City_Employee_Payroll.csv file on teaching server contains payroll information of LA City employees in years 2013-2017. It was downloaded from LA City Controller’s Office. Make a Shiny app to facilitate exploratory data analysis.

  1. For efficiency of the Shiny app, you should first pre-process, pare down, tidy, and save the data, e.g., as a compressed RDS file, to be used in the app.

  2. Total payroll by LA City. Visualize the total LA City payroll of each year, with breakdown into base pay, overtime pay, and other pay.

  3. Who earned most? Visualize the payroll information (total payment with breakdown into base pay, overtime pay, and other pay, Department, Job Title) of the top \(n\) highest paid LA City employees in a specific year. User specifies \(n\) (default 10) and year (default 2017).

  4. Which departments earn most? Visualize the mean or median payroll, with breakdown into base pay, overtime pay, and other pay, of top \(n\) earning departments. User specifies \(n\) (default 5), year (default 2017), and method (mean or median, default median).

  5. Which departments cost most? Visualize the total payroll, with breakdown into base pay, overtime pay, and other pay, of top \(n\) expensive departments. User specifies \(n\) (default 5) and year (default 2017).

  6. Visualize any other information you are interested in.

  7. Publish your Shiny app to https://www.shinyapps.io and share the link.
    My app is located at https://emjcampos.shinyapps.io/la-payroll/.

Q2 LA City Parking War

The SQLite database /home/m280-data/la_parking/LA_Parking_Citations.sqlite on teaching server contains information about parking tickets in LA City. It was downloaded from LA Open Data Portal. Connect to the database and answer following questions using plots and summary statistics. In this exercise, you are not allowed to load whole data into memory. Use the transform in database, plot in R strategy.

The data set looks like this:

  1. How many tickets are in this data set? Which time period do these tickets span? Which years have most data?

There are 4,044,488 tickets in this data set. The parking tickets span 4/27/2010 to 12/30/2017. From the preceding plot, we see the most data is from 2015-2016.

  1. When (which hour, weekday, month day, and month) are you most likely to get a ticket and when are you least likely to get a ticket?
Time Highest Tickets Lowest Tickets.
Hour 12 506580 5 18317
Weekday 3 796643 7 125327
Day of the Month 22 144331 31 75464
Month 3 392473 11 177039

  1. Which car makes received most citations?

The make with the most citations is TOYT.

  1. How many different colors of cars were ticketed? Which color attracted most tickets?

There are 66 colors and BK attracted the most tickets.

  1. What are the most common ticket types?

The top 10 ticket types are depicted with their counts in the following bar plot:

  1. How much money was collected on parking tickets in 2015 and 2016?

The total amount of fines collected on parking tickets in 2015 and 2016 was $274,242,930.

  1. Visualize any other information you are interested in.

Most fines are concentrated under $100, but there are upticks around $160 and $360.