Data Visualization

You are required to create a project and present it to the class at the end of the semester.  You may work in teams (size to be determined in class).

Your task is to select a dataset and tell a visual story about it that is meaningful in whatever context you define.  This includes both exploration and explanation.  You should visually explore the dataset for possible relationships among variables, possible trends (if applicable), or possible insights.  You should then present your conclusion, make your "sales pitch", or make your point in a visually persuasive and effective manner.  The specific visualizations will depend upon the nature of your data and the story that you want to tell.


Consider the dataset you select for the project.  You can use almost any dataset, with my approval.  However, this course isn’t focused on the data cleaning process.  I would suggest the following criteria:

Include the following application requirements and technical specifications for the project:
Assume that this is a project for an individual, a group, a business or an organization.

Presentation requirements for the visual exploration and explanation of your dataset:
  1. What is the title of this presentation?  The title should capture the focus of your visualizations.
  2. What is the organization/individual for whom this presentation is prepared?
  3. What is the mission of this organization (or, for an individual, what is the underlying interest of the individual in pursuing this application domain)?
  4. What is the purpose of this project?
  5. Why will this project support the mission of the organization/individual?
  6. Tell us about your dataset:
  7. Walk through your exploration process.  I don't just want to see a series of slides with tables, graphs and charts.  I want to see how you used these visualizations to explore your dataset and to get a better understanding of the data, and how you progressed from one visualization to the next.  What question was answered, or what new question was raised that inspired you to try another kind of graph or chart?  Were things too cluttered, so that you needed to simplify?  Did you notice a trend or a relationship and decide to explore that further?  Did you want more detail?  Less detail?  A different type of visualization?
  8. Discuss the technical details of your choices:  Explain your use of color, tick marks, annotations, titles, axis scales, or any other feature that you included.
  9. After showing us your visual exploration of the dataset, visually present some conclusions, talking points, or other explanatory visuals.

Technical requirements:
  1. Include at least one clustered bar or column chart.
  2. Include at least one stacked bar or column chart.
  3. Include at least one scatter plot.
  4. Include one graph or chart using base R graphics.  (A histogram or boxplot might work well, but you are not limited to those.)
  5. Include one line or area graph.  It may not be appropriate for your data.  Do it anyway, and explain why it is or is not appropriate.
  6. Include one dot plot or violin plot in R.
  7. Use scale_color and/or scale_fill at least once.
  8. Include at least one faceted graph.
  9. Most of your graphs will probably be in ggplot.  But include at least one "creative" graph using Excel.  
  10. Include at least one tabular visualization.  This can also be a matrix in Power View, with drill down capabilities.
  11. Include one "one large number" visualization.  This can be done directly in PPT, or using a pie chart in Excel.
  12. For excellence points:  add your table(s) to a data model in Excel, and create graph(s) in Power View, Maps or 3D Maps.
  13. For excellence points:  use melt and/or reshape in R.
  14. Find some package in R that we didn't use in class.  Use it.
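
As a rough sketch of how requirements 1, 2, 7 and 8 might look in ggplot: the example below assumes the ggplot2 package is installed and uses the built-in mtcars data as a stand-in for your own dataset.  The object names (cars, gear_colors, p_clustered, p_stacked, p_faceted) and the color values are illustrative choices, not required ones.

```r
library(ggplot2)

# mtcars ships with R; it is only a stand-in for your project dataset.
cars <- mtcars
cars$cyl  <- factor(cars$cyl)   # treat cylinder count as a category
cars$gear <- factor(cars$gear)  # treat gear count as a category

# Illustrative palette, one color per gear level (3, 4, 5).
gear_colors <- c("3" = "#1b9e77", "4" = "#d95f02", "5" = "#7570b3")

# Clustered bar chart: position = "dodge" places the gear groups side by side.
p_clustered <- ggplot(cars, aes(x = cyl, fill = gear)) +
  geom_bar(position = "dodge") +
  scale_fill_manual(values = gear_colors) +
  labs(title = "Cars by cylinders and gears (clustered)",
       x = "Cylinders", y = "Count", fill = "Gears")

# Stacked bar chart: geom_bar() stacks the fill groups by default.
p_stacked <- ggplot(cars, aes(x = cyl, fill = gear)) +
  geom_bar() +
  scale_fill_manual(values = gear_colors) +
  labs(title = "Cars by cylinders and gears (stacked)",
       x = "Cylinders", y = "Count", fill = "Gears")

# Faceted scatter plot: one panel per cylinder class.
p_faceted <- ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl) +
  labs(title = "Fuel economy vs. weight, by cylinders",
       x = "Weight (1000 lbs)", y = "Miles per gallon")
```

Call print(p_clustered), print(p_stacked) or print(p_faceted) to draw each plot.  Seeing the clustered and stacked versions of the same data side by side is one way to discuss which one serves your story better.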

Some things I am looking for:
  1. Appropriate use of data.frames.
  2. Conversion to/from factors.
  3. Selection of rows and columns to extract the appropriate subset of data for graphing in R.
  4. Appropriate selection of graph for your data type.
  5. Correct interpretation of the graph.
  6. Understanding of the effectiveness of the graph.
  7. Creativity.
  8. Effective use of color for the application and for your audience.
  9. Adherence to Tufte, Cleveland, etc. guidelines for effective visualization.
  10. I am interested in the process!  You may show the same graph through several iterations to show how you refined it to make it better.
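
Items 1 through 3 above might look something like the following in base R.  This is only a minimal sketch: it uses the built-in mtcars data as a stand-in for your own dataset, and the names cars, cyl_f, cyl_codes, cyl_values and manual are illustrative.

```r
# mtcars ships with R; it is only a stand-in for your project dataset.
cars <- mtcars

# Numeric code -> factor with readable labels (in mtcars, am: 0 = automatic, 1 = manual).
cars$am <- factor(cars$am, levels = c(0, 1), labels = c("automatic", "manual"))

# Factor -> numeric: go through as.character() to recover the original
# values rather than the internal level codes.
cars$cyl_f <- factor(cars$cyl)
cyl_codes  <- as.numeric(cars$cyl_f)                # level codes: 1, 2, 3
cyl_values <- as.numeric(as.character(cars$cyl_f))  # original values: 4, 6, 8

# Select rows (manual cars only) and columns (three variables) for graphing.
manual <- cars[cars$am == "manual", c("mpg", "wt", "cyl_f")]

# Base R graphics: fuel economy by cylinder count for the selected subset.
boxplot(mpg ~ cyl_f, data = manual,
        xlab = "Cylinders", ylab = "Miles per gallon",
        main = "Fuel economy of manual cars")
```

The difference between cyl_codes and cyl_values is a common source of silently wrong axes, so check the result whenever you convert a factor back to numbers.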

Submit the following:
  1. Your PPT presentation.
  2. A zip file with your entire R application.  
  3. Your Excel file.