data-analysis
Differences
This shows you the differences between two versions of the page.
| Next revision | Previous revision | ||
| data-analysis [2023/04/15 00:43] – created dblume | data-analysis [2024/05/06 22:22] (current) – Added mention of set term block braille dblume | ||
|---|---|---|---|
| Line 25: | Line 25: | ||
| Here's an example command given the following two files, data.csv and gnuplot_instructions.gpi | Here's an example command given the following two files, data.csv and gnuplot_instructions.gpi | ||
| - | gnuplot -e " | + | gnuplot -e " |
| <file csv data.csv> | <file csv data.csv> | ||
| + | date, | ||
| 1992-01-01, | 1992-01-01, | ||
| 1992-02-01, | 1992-02-01, | ||
| Line 42: | Line 43: | ||
| # | # | ||
| set term dumb `tput cols` `tput lines`*9/10 | set term dumb `tput cols` `tput lines`*9/10 | ||
| + | # Or, if you have gnuplot 6.0 and are using a DejaVu font, then... | ||
| + | #set term block braille size `tput cols`,`tput lines`*9/10 | ||
| # | # | ||
| Line 58: | Line 61: | ||
| set xdata time | set xdata time | ||
| set xlabel ' | set xlabel ' | ||
| - | set ylabel ' | + | set xtics " |
| + | #set ylabel ' | ||
| # | # | ||
| Line 70: | Line 74: | ||
| # | # | ||
| set datafile sep ',' | set datafile sep ',' | ||
| + | set key autotitle columnhead | ||
| + | firstrow = system(' | ||
| + | set xlabel word(firstrow, | ||
| + | set ylabel word(firstrow, | ||
| # | # | ||
| Line 77: | Line 85: | ||
| # | # | ||
| #plot f using 1:4 with lines, f using 1:3 with linespoints | #plot f using 1:4 with lines, f using 1:3 with linespoints | ||
| - | #plot f using 1:2 with lines title t, f using 1:3 with linespoints title ' | + | #plot f using 1:2 with lines, f using 1:3 with linespoints title ' |
| - | plot f using 1:2 with linespoints | + | plot f using 1:2 with linespoints |
| + | </ | ||
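The ''firstrow'' lines in the script above appear to take the axis labels from the CSV header row instead of hard-coding them. The same idea in Python, as a minimal sketch: ''axis_labels'' is a hypothetical helper, and the sample rows are made up, since the real header of data.csv is not reproduced here.

```python
import csv
import io


def axis_labels(csv_text, x_col=1, y_col=2):
    """Return (xlabel, ylabel) taken from the header row.

    Columns are 1-based, matching how gnuplot's word() counts.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)  # only the first row is needed
    return header[x_col - 1].strip(), header[y_col - 1].strip()


# Hypothetical data for illustration only.
sample = "date,value\n1992-01-01,10\n1992-02-01,12\n"
xlabel, ylabel = axis_labels(sample)
print(xlabel, ylabel)
```

Picking labels by column number keeps the plot script reusable across CSV files whose headers differ.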
| + | |||
| + | If you're making a " | ||
| + | <file bash gnuplot_instructions.gpi> | ||
| + | # Mostly the same as above, until... | ||
| + | |||
| + | # Set your X axis format | ||
| + | set style histogram clustered gap 1 | ||
| + | set style fill solid border -1 | ||
| + | # Finally, plot with boxes | ||
| + | plot f using 1:2 with boxes | ||
| </ | </ | ||
| Line 91: | Line 110: | ||
| * NumPy: Fundamental, | * NumPy: Fundamental, | ||
| * Matplotlib: Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. | * Matplotlib: Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy. | ||
| + | * Plotly: Generates interactive JavaScript plots. | ||
| Here's [[https:// | Here's [[https:// | ||
| Line 111: | Line 131: | ||
| **Get this**. NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, | **Get this**. NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, | ||
| + | |||
| + | ==== Plotly ==== | ||
| + | |||
| + | Undecided whether to use this. See [[https:// | ||
| ==== Matplotlib ==== | ==== Matplotlib ==== | ||
| Line 116: | Line 140: | ||
| **Get this**. Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy | **Get this**. Matplotlib is a plotting library for the Python programming language and its numerical mathematics extension NumPy | ||
| + | ====== Case Study: Temporal Series ====== | ||
| + | |||
| + | Data [[https:// | ||
| + | |||
| + | ===== VisiData ===== | ||
| + | |||
| + | vd AirPassengers.csv | ||
| + | |||
| + | ^ Key ^ Action ^ | ||
| + | | @ | Set Column one as date format | | ||
| + | | ! | Set Column one as " | ||
| + | | l | Navigate to column 2 | | ||
| + | | # | Set data as integer format | | ||
| + | | . | Create scatterplot | | ||
| + | |||
| + | {{: | ||
| + | |||
| + | **Pros**: Super fast and easy. | ||
| + | **Cons**: Need to use a font where Braille is supported. It's a scatterplot without lines. | ||
| + | |||
| + | ===== gnuplot ===== | ||
| + | |||
| + | <file gnuplot AirPassengers.gpi> | ||
| + | # For ASCII on one full screen | ||
| + | #set term dumb `tput cols` `tput lines`*9/10 | ||
| + | |||
| + | # If you have gnuplot 6.0 and are using a DejaVu font, then... | ||
| + | #set term block braille size `tput cols`,`tput lines`*9/10 | ||
| + | |||
| + | # For a PNG file. | ||
| + | set term png size 900,400; set output ' | ||
| + | |||
| + | set timefmt ' | ||
| + | set xdata time | ||
| + | set format x ' | ||
| + | set key autotitle columnhead | ||
| + | set xlabel ' | ||
| + | set datafile sep ',' | ||
| + | |||
| + | # You can use: lines, points, linespoints | ||
| + | plot ' | ||
| + | </ | ||
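The commented-out ''set term dumb'' line sizes the text plot to the full terminal width and nine tenths of its height, which leaves a few lines free for the shell prompt. The arithmetic, as a small Python sketch: ''dumb_term_size'' is a hypothetical helper, and the 80x24 fallback is an assumption for when no terminal is attached.

```python
import shutil


def dumb_term_size(cols, lines):
    """Return (width, height): full width and 9/10 of the lines, rounded down."""
    return cols, lines * 9 // 10


# Falls back to 80x24 when the size cannot be detected (an assumption).
size = shutil.get_terminal_size(fallback=(80, 24))
width, height = dumb_term_size(size.columns, size.lines)
print(f"set term dumb {width} {height}")
```

The integer division mirrors what gnuplot does with the integer expression produced by the backtick substitution.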
| + | |||
| + | gnuplot AirPassengers.gpi && explorer.exe AirPassengers.png | ||
| + | |||
| + | {{: | ||
| + | |||
| + | When you change '' | ||
| + | |||
| + | < | ||
| + | 700 +-------------------------------------------------------------------------------+ | ||
| + | | + | ||
| + | | # | ||
| + | | * | | ||
| + | 600 |-+ *+-| | ||
| + | | * * * | | ||
| + | | | ||
| + | 500 |-+ | ||
| + | | ** * * * * | | ||
| + | | * | ||
| + | | * * ** * *** *| | ||
| + | 400 |-+ | ||
| + | | * * * * * ** * ** | ||
| + | | ** * * *** ***** *** | | ||
| + | | ** | ||
| + | 300 |-+ * * **** | ||
| + | | | ||
| + | | ** ** * *** **** * | | ||
| + | 200 |-+ * | ||
| + | | | ||
| + | | ** ** * *** ** | | ||
| + | |********* | ||
| + | 100 +-------------------------------------------------------------------------------+ | ||
| + | 1949 | ||
| + | Year | ||
| + | </ | ||
| + | |||
| + | |||
| + | |||
| + | **Pros**: Fast and easy. Renders to text or PNG easily. Text renderings are sometimes better than VisiData's. | ||
| + | **Cons**: Not that pretty without customizations. GPI file takes some tweaking. | ||
| + | |||
| + | ===== Matplotlib ===== | ||
| + | |||
| + | <code python> | ||
| + | import pandas as pd | ||
| + | data = pd.read_csv(' | ||
| + | data[' | ||
| + | data = data.set_index([' | ||
| + | |||
| + | import matplotlib.pyplot as plt | ||
| + | plt.figure(figsize=(10, | ||
| + | plt.xlabel(" | ||
| + | plt.ylabel(" | ||
| + | plt.plot(data) | ||
| + | plt.show() | ||
| + | </ | ||
| + | |||
| + | {{: | ||
| + | |||
| + | **Pros**: There's [[https:// | ||
| + | **Cons**: Heavyweight. | ||
data-analysis.1681544624.txt.gz · Last modified: 2023/04/15 00:43 by dblume