Lab: Interactive Visualizations with Plotly
DSAN5200 - Spring 2024
Overview and dataset
For this lab, you will work with the dataset collected to produce the paper Summarizing CPU and GPU Design Trends with Product Data. There is a related website https://chip-dataset.vercel.app/ that shows some of this data graphically using interactive plots.
We will discuss initial data processing and visualization approaches in class, thinking about how added elements of interaction (hovering, mouse clicks, etc.) can improve on the website's visualizations using Plotly.
The dataset is a CSV file in the data/ directory which contains almost 5,000 data points collected over time. To the best of the authors’ knowledge, it includes all CPU and GPU products and is kept up to date.
| Field | Description (blank if self-explanatory) |
|---|---|
| Product | Manufacturer, series and model number |
| Type | CPU or GPU |
| Release Date | |
| Process Size (nm) | |
| TDP (W) | Thermal Design Power |
| Die Size (mm^2) | |
| Transistors (million) | |
| Freq (GHz) | Base clock speed |
| Foundry | The chip fabricator |
| Vendor | Manufacturer |
| FP16 GFLOPS | Billions (giga) of floating-point operations per second (FLOPS) at 16-bit precision |
| FP32 GFLOPS | Billions (giga) of floating-point operations per second (FLOPS) at 32-bit precision |
| FP64 GFLOPS | Billions (giga) of floating-point operations per second (FLOPS) at 64-bit precision |
Data considerations (from the paper)
- Many GPU products (for example, the NVIDIA Tesla K80) contain more than one GPU chip on a single board. This dataset only considers the properties for one of its chips.
- For device frequency, only the base clock frequency (in GHz) is considered.
- Energy consumption is estimated using TDP (Thermal Design Power).
- Transistor counts for Intel CPUs produced since 2014 are missing because they are not provided.
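The considerations above suggest checking types and missing values before plotting. The following is a minimal sketch in Python with pandas, not a required approach. It assumes the CSV column names match the field table exactly (the real header may differ), and `prepare_chips` and `missing_share` are hypothetical helper names introduced only for illustration.

```python
import pandas as pd

# Assumed column names, copied from the field table; verify against the real CSV header.
DATE_COL = "Release Date"
NUMERIC_COLS = [
    "Process Size (nm)",
    "TDP (W)",
    "Die Size (mm^2)",
    "Transistors (million)",
    "Freq (GHz)",
]


def prepare_chips(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with parsed dates and numeric columns.

    Values that cannot be parsed become NaT/NaN instead of raising an error.
    """
    df = raw.copy()
    df[DATE_COL] = pd.to_datetime(df[DATE_COL], errors="coerce")
    for col in NUMERIC_COLS:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors="coerce")
    # A year column is convenient for grouping and faceting later on.
    df["Release Year"] = df[DATE_COL].dt.year
    return df


def missing_share(df: pd.DataFrame, column: str) -> pd.Series:
    """Fraction of rows with a missing `column`, grouped by Vendor and Type."""
    return df[column].isna().groupby([df["Vendor"], df["Type"]]).mean()


# Example usage (the file name below is a placeholder, not the real one):
# chips = prepare_chips(pd.read_csv("data/your-file.csv"))
# print(missing_share(chips, "Transistors (million)"))
```

A table like the one returned by `missing_share` makes gaps such as the missing Intel transistor counts visible before they silently distort a trend line.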
Tasks
- Using the website as an inspiration, create four interactive Plotly plots (two with R, and two with Python) that help illustrate Moore’s Law and the differences among CPUs, GPUs, and vendors over the years.
Notes:
- You may only use the `plotly.graph_objects` library in Python and the `plotly` package in R. `plotly.express` may not be used.
- You must build the visualizations from scratch with Plotly. You may not generate Plotly from `matplotlib`, `seaborn` or `ggplot` objects.
- You may choose to create new visualizations, or recreate a version of an existing plot on the website, but you must make improvements. Some ideas:
  - Modify the data (via transformations or other methods)
  - Change the unit of analysis
  - Select appropriate color encodings
  - Use faceting as appropriate
  - Add additional interactions
- You do not need to re-create the selection boxes, just focus on the visualizations
- Observable may not be used.
Remember:
- Your visualizations must:
  - Have a descriptive caption (using the chunk option `fig-cap`).
  - Be properly labeled, with axes labeled and units shown.
  - Have any necessary annotations to help the viewer understand.
  - Have a title.
  - Use the themes you’ve developed.
- Separate your sections with a level 2 header (`## ...`).
- Make sure to use spell and grammar check.
Submission
The final deliverable must be a rendered Quarto HTML document named `my_submission.html` at the root level. The corresponding Quarto document must be called `my_submission.qmd`.

Use the `code/` directory for all your work (except the Quarto document). Keep your code organized and commented. You may write a `README.md` file within the `code/` subdirectory to provide additional insight and context about your work.

You do not need to do everything in the single Quarto file at the root level. You can work on the visualizations, data processing, etc. in individual `R` or `Python` scripts, or in other Quarto documents in the `code/` directory.

When you render the Quarto file, you can include your other documents in the `code/` directory in one of two ways:

- Including a script in a chunk by using a code chunk that looks like this:

```{r}
#| echo: true
#| eval: true
#| file: code/myscript.R
```

- Including other Quarto documents (not scripts) by using the following notation:

{{< include code/name-of-qmd.qmd >}}
Rubric
| Component | Excellent | Satisfactory | Unsatisfactory |
|---|---|---|---|
| Breadth and Depth of Exploration | A sufficient number of follow-up questions were asked to yield insights that helped to more deeply explore the initial questions. | Some follow-up questions were asked, but they did not take the analysis much deeper than the initial questions. | No follow-up questions were asked after answering the initial questions. |
| Visualizations | The required number of visualizations was produced, and a variety of marks and encodings were explored. All design decisions were both expressive and effective. | The required number of visualizations was produced. The visual encodings chosen were largely effective and expressive, but some errors remain. | Several ineffective or inexpressive design choices were made. Fewer than the required number of visualizations were created. |
| Captions | Captions richly describe the visualizations and contextualize the insight. | Captions do a good job describing the visualizations, but could better contextualize the insight. | Captions are missing, overly brief, or shallow. |
| Expectations | You exceeded the parameters of the assignment, with original insights or a particularly engaging design. | You met all the parameters of the assignment. | You met some but not all of the parameters of the assignment. |