vd
Differences
This shows you the differences between two versions of the page.
vd [2021/08/06 13:19] – [Case Study: Exported CSV from PG&E] dblume | vd [2024/05/13 11:23] (current) – [Process Data] dblume | ||
---|---|---|---|
Line 25: | Line 25: | ||
| ; | Extract regex to new column. Ex, '' | | ; | Extract regex to new column. Ex, '' | ||
| %%^%% | rename the column. Might have to be " | | %%^%% | rename the column. Might have to be " | ||
+ | | = | Use Python function to create new column. Ex, hex to dec: '' | ||
+ | | : | Split column by regex | | ||
| - | Hide column | | | - | Hide column | | ||
| S | Go to " | | S | Go to " | ||
Line 34: | Line 36: | ||
| " | Open duplicate sheet with only selected rows | | | " | Open duplicate sheet with only selected rows | | ||
- | ==== Case Study: Exported CSV from PG& | + | ===== Inspecting Columnar Data ===== |
- | PG&E CSVs come with 5 rows of metadata followed by Type, Date, Start Time, End Time, Usage, Units, Cost, Notes columns. Delete the five rows of metadata in a text editor, or use '' | + | ^ Key ^ Meaning ^ |
+ | | I | Describe all columns, errors, distinct, mode, mean, median, stdev, etc. | | ||
+ | | i | Add a column of incrementing numbers (useful for '.' | ||
+ | | . | Requires an " | ||
+ | | O | Options to enable " | ||
+ | | F | Frequency table of row counts, or histogram if numeric_binning is true | | ||
- | tail +6 pge_electric_interval_data.csv | vd -f csv - | + | Calculating a percentage-of-total column for a numeric column: |
- | + | ||
- | Then prepare your PG&E data like so: | + | |
^ Key ^ Meaning ^ | ^ Key ^ Meaning ^ | ||
- | | - | Hide columns TYPE, END TIME, UNITS and NOTES | | + | | # | Set column type to " |
- | | C | Go to column mode and... | | + | | I | Describe all columns. (Highlight |
- | | t | Select | + | | ~ | Convert that column |
- | | & | Make a new column | + | | zy | Yank the value of the sum. | |
- | | q, - | Quit the Column mode, hide DATE and START TIME columns | | + | | q | Quit the Describe sheet. | |
- | | O | Go to options mode and... | | + | | = | New column. Enter '' |
- | | e | Set '' | + | ====== Case Study Link: Exported CSV from PG&E ====== |
- | | q | Quit options mode. | | + | |
- | | @!, %, $ | Set DATE_START_TIME to date format and important, USAGE to float, COST to currency | + | Visit [[vd-pge]]. |
- | | = | Add a column, enter '' | + | |
- | %%^%% | Rename COST/ | + | ====== Case Study: Merging Two Tables, logs and metadata ====== |
- | | . or g. | Select columns to graph them. Notice rate changes. Notice times of high use. | | + | |
- | | +, - | Navigate with hjkl, zoom in and out ([[https:// | + | |
==== Protip: Use column view to set multiple columns at once ==== | ==== Protip: Use column view to set multiple columns at once ==== | ||
Line 125: | Line 128: | ||
$ vd --play=my_cmdlog.vd --replay-wait=0.5 | $ vd --play=my_cmdlog.vd --replay-wait=0.5 | ||
+ | ====== Lists in Cells for Frequency Tables ====== | ||
+ | Sometimes you want one of the columns in a Frequency Table to be a list of unique values. Let's say the column title is " | ||
+ | |||
+ | ^ Key ^ Meaning ^ | ||
+ | | + | Set the aggregator to " | ||
+ | | F | Make a Frequency Table for the selected column. (gF for selected columns) | | ||
+ | | =, ',' |
vd.1628281188.txt.gz · Last modified: 2023/04/12 20:44 (external edit)