Lab: Interactive Visualizations with Plotly

DSAN5200 - Spring 2024

Published

February 5, 2024

Overview and dataset

For this lab, you will work with the dataset collected to produce the paper Summarizing CPU and GPU Design Trends with Product Data. A related website, https://chip-dataset.vercel.app/, shows some of this data graphically using interactive plots.

We will discuss initial data processing and visualization approaches in class, thinking about how added elements of interaction (hovering, mouse clicks, etc.) can improve the visualizations on the website using Plotly.

The dataset is a CSV file in the data/ directory that contains almost 5,000 data points collected over time. To the best of the authors’ knowledge, it includes all CPU and GPU products and is kept up to date.

| Field | Description (blank if self-explanatory) |
|---|---|
| Product | Manufacturer, series and model number |
| Type | CPU or GPU |
| Release Date | |
| Process Size (nm) | |
| TDP (W) | Thermal Design Power |
| Die Size (mm^2) | |
| Transistors (million) | |
| Freq (GHz) | Base clock speed |
| Foundry | The chip fabricator |
| Vendor | Manufacturer |
| FP16 GFLOPS | Billions (giga) of floating-point operations per second (FLOPS) at 16-bit precision |
| FP32 GFLOPS | Billions (giga) of floating-point operations per second (FLOPS) at 32-bit precision |
| FP64 GFLOPS | Billions (giga) of floating-point operations per second (FLOPS) at 64-bit precision |
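
A minimal loading sketch in Python, assuming pandas is available. The column names come from the field list above; the inline sample rows and their values are made up purely so the snippet runs on its own, and the function name load_chips is hypothetical. In the lab you would pass the path of the actual CSV in data/ instead of the sample.

```python
import io

import pandas as pd

# Made-up sample rows for illustration only (not taken from the real dataset).
SAMPLE_CSV = """Product,Type,Release Date,Process Size (nm),TDP (W),Die Size (mm^2),Transistors (million),Freq (GHz),Foundry,Vendor
Example CPU A,CPU,2008-01-15,45,65,107,410,2.6,Intel,Intel
Example GPU B,GPU,2016-05-27,16,180,314,7200,1.6,TSMC,NVIDIA
Example CPU C,CPU,2019-07-07,7,105,,,3.8,TSMC,AMD
"""


def load_chips(source) -> pd.DataFrame:
    """Read the chip CSV and coerce the date and numeric columns."""
    df = pd.read_csv(source)
    # Unparseable dates become NaT instead of raising an error.
    df["Release Date"] = pd.to_datetime(df["Release Date"], errors="coerce")
    numeric_cols = [
        "Process Size (nm)", "TDP (W)", "Die Size (mm^2)",
        "Transistors (million)", "Freq (GHz)",
    ]
    for col in numeric_cols:
        # Non-numeric entries become NaN so they can be filtered later.
        df[col] = pd.to_numeric(df[col], errors="coerce")
    # A plain year column is convenient for grouping and for axis labels.
    df["Year"] = df["Release Date"].dt.year
    return df


chips = load_chips(io.StringIO(SAMPLE_CSV))
print(chips.dtypes)
print(chips[["Product", "Year", "Transistors (million)"]])
```

Coercing with errors="coerce" keeps rows with missing values in the frame, so you can decide per plot whether to drop them.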

Data considerations (from the paper)

  • Many GPU products (for example, the NVIDIA Tesla K80) contain more than one GPU chip on a single board. This dataset only considers the properties for one of its chips.
  • For device frequency, only the base clock frequency speed (in GHz) is considered.
  • Energy consumption is estimated using TDP (Thermal Design Power).
  • Transistor counts for Intel CPUs produced since 2014 are missing because those figures are not provided.

Tasks

  • Using the website as an inspiration, create four interactive Plotly plots (two with R, and two with Python) that help illustrate Moore’s Law and the differences in CPUs, GPUs, and vendors over the years.
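
Moore’s Law is commonly stated as transistor counts doubling roughly every two years. One way to illustrate it is a reference curve drawn alongside the data. The sketch below shows only the arithmetic, assuming a two-year doubling period; the function name moores_law and the baseline values are hypothetical and not taken from the dataset.

```python
def moores_law(year: float, base_year: float, base_count: float,
               doubling_years: float = 2.0) -> float:
    """Projected transistor count at `year`, doubling every `doubling_years`."""
    return base_count * 2 ** ((year - base_year) / doubling_years)


# Hypothetical baseline: 100 million transistors in 2004.
for year in (2004, 2006, 2014, 2024):
    projected = moores_law(year, base_year=2004, base_count=100)
    print(f"{year}: {projected:,.0f} million transistors")
```

On a log-scaled y-axis this curve becomes a straight line, which makes it easier to see where real products sit above or below the trend.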

Notes:

  • You may only use the plotly.graph_objects library in Python and the plotly package in R. plotly.express may not be used.
  • You must build the visualizations from scratch with Plotly. You may not generate Plotly from matplotlib, seaborn or ggplot objects.
  • You may choose to create new visualizations, or recreate a version of an existing plot on the website, but you must make improvements. Some ideas:
    • Modify the data (via transformations or other methods)
    • Change the unit of analysis
    • Select appropriate color encodings
    • Use faceting as appropriate
    • Add additional interactions
    • You do not need to re-create the selection boxes, just focus on the visualizations
  • Observable may not be used.

Remember:

  • Your visualizations must:
    • Have a descriptive caption (using the chunk option fig-cap).
    • Be properly labeled, with axes labeled and units shown.
    • Have any necessary annotations to help the viewer understand.
    • Have a title.
    • Use the themes you’ve developed.
  • Separate your sections with a level 2 header (## ...).
  • Make sure to use spell and grammar check.

Submission

  • The final deliverable must be a rendered Quarto HTML document named my_submission.html at the root level. The corresponding Quarto document must be called my_submission.qmd.

  • Use the code/ directory for all your work (except the Quarto document). Keep your code organized and commented. You may write a README.md file within the code/ subdirectory to provide additional insight and context about your work.

  • You do not need to do everything in the single Quarto file at the root level. You can work on the visualizations, data processing, etc. in individual R or Python scripts, or in other Quarto documents in the code/ directory.

  • When you render the Quarto file, you can include your other documents in the code/ directory in one of two ways:

    • Including a script in a chunk by using a code chunk that looks like this:

      ```{r}
      #| echo: true
      #| eval: true 
      #| file: code/myscript.R
      ```
    • Including other Quarto documents (not scripts) by using the following notation: {{< include code/name-of-qmd.qmd >}}

Rubric

| Component | Excellent | Satisfactory | Unsatisfactory |
|---|---|---|---|
| Breadth and Depth of Exploration | A sufficient number of follow-up questions were asked to yield insights that helped to more deeply explore the initial questions. | Some follow-up questions were asked, but they did not take the analysis much deeper than the initial questions. | No follow-up questions were asked after answering the initial questions. |
| Visualizations | The required number of visualizations was produced, and a variety of marks and encodings were explored. All design decisions were both expressive and effective. | The required number of visualizations was produced. The visual encodings chosen were largely effective and expressive, but some errors remain. | Several ineffective or inexpressive design choices were made. Fewer than the required number of visualizations were created. |
| Captions | Captions richly describe the visualizations and contextualize the insight. | Captions do a good job describing the visualizations, but could better contextualize the insight. | Captions are missing, overly brief, or shallow. |
| Expectations | You exceeded the parameters of the assignment, with original insights or a particularly engaging design. | You met all the parameters of the assignment. | You met some but not all of the parameters of the assignment. |