vd

Differences

This shows you the differences between two versions of the page.

vd [2023/05/17 14:00] – [Inspecting Columnar Data] dblume
vd [2026/01/14 13:03] (current) – [Calculating a percentage column for only specific columns] dblume
Line 45: Line 45:
 | F | Frequency table of row counts, or histogram if numeric_binning is true |
  
-Calculating a percentage-of-total column for a numeric column:
+====== Adding Percentage Columns ======
 + 
 +===== Calculating a percentage-of-global-total column =====
  
 ^ Key ^ Meaning ^
Line 54: Line 56:
 | q | Quit the Describe sheet. |
 | = | New column. Enter ''curcol/'', use Ctrl+y to paste the column sum value. |
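
As a cross-check outside VisiData, the same percentage-of-global-total arithmetic can be sketched in plain Python. The sample rows, the ''Counts'' column name, and the helper ''add_percent_of_total'' are all hypothetical and only illustrate the calculation; this is not how VisiData computes it internally.

```python
# Hypothetical sample data; the column names are illustrative only.
rows = [
    {"Platform": "web", "Counts": 30},
    {"Platform": "ios", "Counts": 50},
    {"Platform": "android", "Counts": 20},
]


def add_percent_of_total(rows, column):
    """Return copies of rows with a '<column>_pct' field: value * 100 / column sum."""
    total = sum(row[column] for row in rows)
    return [{**row, column + "_pct": row[column] * 100.0 / total} for row in rows]


with_pct = add_percent_of_total(rows, "Counts")
for row in with_pct:
    print(row["Platform"], row["Counts_pct"])
```

The division by the column sum plays the same role as pasting the sum after ''curcol/'' in the new-column expression.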
 +
 +===== Calculating a percentage column for only specific columns =====
 +
 +Say you've got a sheet with Date, Platform, Features, More, and Counts columns, and 
 +you want a percentage of Counts computed over just the first two columns (Date and Platform), ignoring the others.
 +
 +This requires joining two sheets.
 +
 +^ Key ^ Meaning ^
 +|         | **First, make the subtotals for the first sheet...** |
 +| @ # ! - | Set column types, important columns, and hide unused columns |
 +| + sum   | Set aggregator to sum for the subtotal column |
 +| gF      | Make the frequency table. These sums will be the **dividend**. |
 +|         | **Then hide the Frequency table's new count column, and remove the importance of one column for the next subtotal.** |
 +| ! - + sum | Remove importance, hide the unused "counts" column, and set the aggregator of the Counts_sum column to sum. |
 +| gF      | Make another Frequency table. These sums will be the **divisor**. |
 +|         | **Join the two Frequency tables.** |
 +| S t t & inner | Join the two tables from the Sheets sheet. |
 +| =Counts_sum*100.0/Counts_sum_sum | Make the new percentage column. |
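
The two-subtotal-and-join idea above can be sketched in plain Python to sanity-check the numbers. This is a minimal sketch under assumptions: the sample rows are invented, the dividend is assumed to be grouped by (Date, Platform) and the divisor by Date alone, and ''sum_by'' is a hypothetical helper rather than anything VisiData provides.

```python
from collections import defaultdict

# Hypothetical sample rows using the column names from the example above.
rows = [
    {"Date": "2026-01-01", "Platform": "web", "Features": "a", "Counts": 10},
    {"Date": "2026-01-01", "Platform": "web", "Features": "b", "Counts": 20},
    {"Date": "2026-01-01", "Platform": "ios", "Features": "a", "Counts": 30},
    {"Date": "2026-01-02", "Platform": "web", "Features": "a", "Counts": 5},
    {"Date": "2026-01-02", "Platform": "ios", "Features": "b", "Counts": 15},
]


def sum_by(rows, keys, column):
    """Sum `column` for each distinct combination of `keys` (like gF with a sum aggregator)."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[column]
    return dict(totals)


# Assumption: the dividend groups by both key columns, the divisor by Date only.
dividend = sum_by(rows, ["Date", "Platform"], "Counts")  # plays the role of Counts_sum
divisor = sum_by(rows, ["Date"], "Counts")               # plays the role of Counts_sum_sum

# Inner join on Date, then the same arithmetic as the final '=' expression.
percentages = {
    (date, platform): counts_sum * 100.0 / divisor[(date,)]
    for (date, platform), counts_sum in dividend.items()
}

for key, pct in sorted(percentages.items()):
    print(key, pct)
```

With these sample rows, each Platform's percentage is its share of that Date's total, so the percentages for any one Date add up to 100.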
 ====== Case Study Link: Exported CSV from PG&E ======
  
Line 128: Line 149:
   $ vd --play=my_cmdlog.vd --replay-wait=0.5
  
 +====== Lists in Cells for Frequency Tables ======
  
 +Sometimes you want one of the columns in a Frequency Table to be a list of unique values. Let's say the column title is "my_column", then:
 +
 +^ Key ^ Meaning ^
 +| + | Set the aggregator to "List" |
 +| F | Make a Frequency Table for the selected column. (gF for selected columns) |
 +| =, ','.join(set(my_column)) | Create a new column containing a comma-delimited string of the unique cell entries (deduplicated with a Python set). |
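
What the list aggregator plus the ''','.join(set(...))'' expression produce can be sketched in plain Python. The sample rows, the ''key'' column, and the helper ''unique_values_by_key'' are hypothetical. A Python set has no guaranteed order, so this sketch sorts the values to get a stable result, which the one-line expression above does not do.

```python
from collections import defaultdict

# Hypothetical sample rows; "key" stands in for the column the Frequency Table groups by.
rows = [
    {"key": "a", "my_column": "x"},
    {"key": "a", "my_column": "y"},
    {"key": "a", "my_column": "x"},
    {"key": "b", "my_column": "z"},
]


def unique_values_by_key(rows, key, column):
    """Collect each group's values into a list, then join the unique ones with commas."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row[column])  # the "list" aggregator step
    # sorted() makes the output deterministic; set() alone has arbitrary order.
    return {k: ",".join(sorted(set(values))) for k, values in grouped.items()}


result = unique_values_by_key(rows, "key", "my_column")
print(result)
```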
vd.1684357223.txt.gz · Last modified: 2023/05/17 14:00 by dblume