These scripts are for simple plotting of data. They use the holoviews plotting library.
How to use
First you need to install an Anaconda Python 3.x distribution on your computer. After that you can create the conda environment with:
conda env create -f requirements_<yourSystem>.yml
The next step is to activate the newly built environment:
conda activate PlotScripts
Now it should be possible to run the scripts. You can do this with:
python myplot.py --config <pathtoConfigfile>
This should start the plotting script and, once it has finished, reward you with a nice html plot.
There are several arguments you can pass:
| Arg | Type | Mandatory | Description |
|---|---|---|---|
| --config, --file, --c, --conf | path/str | yes | The path to the config file |
| --dont_show | bool | no | Pass this flag to suppress the html preview of the plots; by default the plots are shown |
| --save, --s | bool | no | Whether to save the plots, default is False |
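The argument table above could be mirrored by an argparse parser along the following lines. This is only a sketch: the function name `build_parser`, the destination names and the example path are assumptions, and the actual parser in myplot.py may look different.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the argument table (not the actual myplot.py code)."""
    parser = argparse.ArgumentParser(description="Simple plotting of data")
    # All four spellings are aliases for the same mandatory option
    parser.add_argument("--config", "--file", "--c", "--conf",
                        dest="config", required=True,
                        help="The path to the config file")
    # Passing the flag switches the preview off, so "show" defaults to True
    parser.add_argument("--dont_show", dest="show", action="store_false",
                        help="Do not show any html plots")
    # Passing the flag switches saving on, so "save" defaults to False
    parser.add_argument("--save", "--s", dest="save", action="store_true",
                        help="Save the plots")
    return parser


# Example path, purely illustrative
args = build_parser().parse_args(["--config", "CONFIGS/myconfig.yml", "--save"])
print(args.config, args.show, args.save)
```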
The Config file
In order to plot anything you need a config file. These are YAML files. Some examples can be found in the CONFIGS folder. In principle such a file looks like this:
---
Files: # The files which are plotted together
  - MyMeasurementfilePath.txt
Filetype: ASCII # The type of my files, other options are CSV, JSON, None and CUSTOM
# If None is passed, the script tries to read in the files based on their file suffix. .txt and .dat will be interpreted as ASCII
Output: myplot # Output folder path for my plots
backend: bokeh # Choose the backend for the plotting. Warning: Output may change with different backends. Possible options: bokeh, matplotlib, plotly
Save_as: # Save the plots in different data formats, if more than one is specified the plots are saved in all of them
  - html
  - png
  - svg
  - xml # Only works if the entry "xml_template_path" points to a correct template!
  - json
xml_template_path: ".\\CONFIGS\\CMSxmlTemplate.yml" # Path to the XML template config file
Analysis:
  - myAnalysisPlugin # The analysis plugin through which the data will be run. These plugins must be located in the folder "analysis_scripts"
Poolsize: 4 # Maximum pool size of simultaneous analysis scripts
# Optional Parameters
ASCII_file_specs: # The specifications for the ascii file type measurements files
  header_lines: 8 # The end line of the header
  measurement_description: 9 # The line in which the measurement/column names are stated
  units_line: 9 # The line in which the units for each measurement name are stated (can be the same as measurement_description)
  data_start: 10 # The line in which the data starts
# If CUSTOM was chosen in 'Filetype', then you must define this custom specs section to tell where to find the file etc
Custom_specs: # Optional: Only needed if you have a custom importer --> As a Filetype the entry CUSTOM has to be passed!
  path: # The path to your python file where the custom importer is
  module: foo # The module name
  name: bar # The name of the function inside the module which you want to load
  parameters: # Additional parameters your importer needs. Omit this entry if you do not need any
    param1: 1
    param2: "Hello"
Measurement_aliases: # Internal renaming of the columns from the input files. Can be used to change names to be compliant with other analyses.
  Your_column_name: the_alias
  Your_column_name2: the_alias2
# Options for the different analysis scripts
# These options are entirely up to you and what you need in your analysis
myAnalysisPlugin: # This must match at least one of your measurement analysis plugins
  General: # Options common to all plots
    fontsize: {'title': 28, 'labels': 24, 'xticks': 24, 'yticks': 24, 'legend': 11}
    responsive: False
    width: 1200
    height: 700
    shared_axes: False
  Layout: # These must be the names of methods understood by holoviews, with valid parameters
    cols: 2 # How many columns there are in the final output plot
  DoSpecialPlots: # Which special plots to do. A special plot may fail if no measurement has it stated in its AdditionalPlots
    - BoxWhisker
    - Violin
    - concatHistogram
    - Histogram
  # Options for SpecialPlots, the suffix must always be "Options"
  BoxWhiskerOptions:
    shared_axes: False
    box_alpha: 0.3
    width: 1200
    height: 900
  # Measurements
  MyFirstMeasurement: # The name of a measurement, as stated in the measurement file
    PlotLabel: Cool Label for my Plot
    PlotStyles: # How to plot the raw data, all holoviews plots are supported if you pass the correct PlotOptions.
      - Scatter
    UnitConversion: nano # The values for this measurement will be converted to this order of magnitude
    AdditionalPlots: # Which additional plots should be made.
      - BoxWhisker
    PlotOptions: # These options will be passed directly to the renderer, make sure they are valid. Look into holoviews to see which options are supported
      logy: True
      logx: False
      invert_xaxis: False
      invert_yaxis: False
      #ylim: !!python/tuple [0, 10e-6]
      #xlim: !!python/tuple [0, 1000]
      legend_position: "bottom_right"
      #aspect: equal
      padding: !!python/tuple [0, 0.1]
      show_grid: True
      gridstyle:
        grid_line_color: black
        grid_line_width: 1.5
        minor_xgrid_line_color: lightgray
        minor_ygrid_line_color: lightgray
        xgrid_line_dash: [4, 4]
        ygrid_line_dash: [10, 4]
      #xlabel: voltage [V]
      #ylabel: current [A]
      shared_axes: False # Whether the axes should be shared with other plots, usually False
      xformatter: "%.0f"
      yformatter: "%.0f"
    ToolsOptions: # These options are for the PlotScripts tool box, or for self-written customizations
      yaxisENG: True # If you want to plot the y axis in engineering representation
Custom importer
If the provided ASCII, JSON or CSV importers do not fit your needs, you can write your own importer. To do so, set the Filetype entry in the config to 'CUSTOM' and define a 'Custom_specs' section as shown in the example config. Your python script must contain a function named after the 'name' entry of the Custom_specs section. This function receives one positional argument, a list of file paths to the chosen files, plus all parameters specified in the 'parameters' entry of the Custom_specs section as kwargs. If you do not need any extra parameters you can delete the parameters entry from your config.
After parsing your data, the framework expects a dict as return value. The top level keys must be some representation of your files (I use the filename). The values to these keys are again dicts, with the keys being the columns/data sets inside, like voltage, capacitance etc. The values can be any iterable object, but I would recommend a list or a numpy array. An example of a custom importer is included in the repo!
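A minimal sketch of such an importer could look as follows. The function is called `bar` only because that is the name used in the example config; the file layout (one line of column names followed by numeric rows) and the `delimiter` parameter are assumptions made for illustration, not part of the framework.

```python
import os


def bar(filepaths, delimiter=",", **kwargs):
    """Hypothetical custom importer, named like the "name" entry of the example config.

    filepaths: list of paths to the chosen files (the one positional argument).
    delimiter and any further keyword arguments come from the "parameters" entry.
    """
    all_data = {}
    for path in filepaths:
        columns = {}
        with open(path) as handle:
            # Assumption: the first line holds the column names, all others hold numbers
            names = handle.readline().strip().split(delimiter)
            for name in names:
                columns[name] = []
            for line in handle:
                if not line.strip():
                    continue  # skip empty lines
                for name, value in zip(names, line.strip().split(delimiter)):
                    columns[name].append(float(value))
        # Top level key: a representation of the file, here the file name
        all_data[os.path.basename(path)] = columns
    return all_data
```

The returned structure is a dict of dicts, for example `{"meas.txt": {"voltage": [...], "capacitance": [...]}}`.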
Plotting backend
PlotScripts is built on holoviews and can plot with different plotting backends. The standard backend is bokeh, but you can choose another backend with the parameter "backend". The options are bokeh, matplotlib and plotly.
Depending on the capabilities of the plotting backend, some plotting options may not be present in all backends. Therefore, the output may change!
The automatic preview will always be done via the bokeh backend. If you want to suppress this you have to pass the argument --dont_show.
The measurements files
If you have json or YAML files you do not need special treatment here, just state it correctly in the config file.
If you have ASCII styled files you need to include the parameters
ASCII_file_specs: # The specifications for the ascii file type measurements files
  header_lines: 18
  measurement_description: 20
  units_line: 21
  data_start: 22
in your config file. The sub-parameters:
- header_lines defines the length of the header in lines
- measurement_description is the line with the names of the measurements
- units_line is the line where the units are described (can be the same line as measurement_description). The only thing this line needs to contain is something like curr [A] or curr[A], so the script can find the units
- data_start is the starting line of the data, which can be separated by tabs, whitespace or commas
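How the four sub-parameters work together can be sketched with a small stand-in parser. This is not the framework's ASCII importer: the function `parse_ascii` is hypothetical, and it assumes that the line numbers are 1-based and that non-numeric entries such as `--` should become NaN.

```python
import re


def parse_ascii(lines, header_lines, measurement_description, units_line, data_start):
    """Illustrative parser; assumes 1-based line numbers, so line n is lines[n - 1]."""
    header = lines[:header_lines]
    # Drop bracketed units first, in case names and units share one line
    name_line = re.sub(r"\s*\[.*?\]", "", lines[measurement_description - 1])
    names = re.split(r"[,\s]+", name_line.strip())
    # Units are whatever stands in square brackets, e.g. curr [A] or curr[A]
    units = re.findall(r"\[(.*?)\]", lines[units_line - 1])
    data = {name: [] for name in names}
    for line in lines[data_start - 1:]:
        if not line.strip():
            continue
        # Data may be separated by tabs, whitespace or commas
        for name, value in zip(names, re.split(r"[,\s]+", line.strip())):
            try:
                data[name].append(float(value))
            except ValueError:
                data[name].append(float("nan"))  # e.g. "--" for a missing value
    return header, names, units, data


example = [
    "# Operator: Dominic",
    "# type: p-type",
    "voltage[V] current [A]",
    "1.0 -1.5e-10",
    "2.0, -1.6e-10",
    "3.0\t--",
]
header, names, units, data = parse_ascii(
    example, header_lines=2, measurement_description=3, units_line=3, data_start=4
)
print(names, units)
```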
If you have CSV styled files you need to include the parameters
CSV_file_specs: # The specifications for the csv file type measurements files
  measurements:
    - Measurement 1
    - Measurement 2
  units:
    - Unit1
    - Unit2
With the "units" entry you can define units for your measurements, otherwise no units will be used.
If you follow these rules the script should be able to interpret your data.
Optional parameters:
Further customization can be done via the optional parameters inside the "ASCII_file_specs" entry:
ASCII_file_specs: # The specifications for the ascii file type measurements files
  header_lines: 18
  measurement_description: 20
  units_line: 21
  data_start: 22
  data_separator: ";"
  measurement_regex: ""
  units_regex: ""
  measurements:
    - voltage
    - current
  units:
    - V
    - A
- data_separator: Define your own data separator if the data is not separated by a whitespace character
- measurement_regex: If the built-in measurement regex does not yield the correct measurements, you can define your own regex here
- units_regex: If the built-in units regex does not yield the correct units, you can define your own regex here
- measurements: Define a list of measurement names which describe your data (you can use this if the regex fails completely or if you do not have such a header)
- units: Define a list of units which describe your data (you can use this if the regex fails completely or if you do not have such a header)
Warning: Since '\' is an escape character in python you have to escape this character by typing '\\' instead of one. Otherwise the regex will fail.
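The effect of the missing escape can be seen in plain Python. The patterns below are hypothetical examples and not the regexes built into the framework; whether your own regex needs a capture group depends on how the framework applies it.

```python
import re

line = "current[A] res[Ohm]"

# "\b" is read as a backspace character, so the pattern never matches
broken = "\bcurrent"
# "\\b" reaches the regex engine as the word boundary \b
working = "\\bcurrent"

print(re.search(broken, line))   # no match
print(re.search(working, line))  # match at the start of the line

# Hypothetical units regex: capture everything between square brackets
units_regex = "\\[(.*?)\\]"
units = re.findall(units_regex, line)
print(units)
```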
Example ASCII file
A readable ASCII file (with the above config) would be:
# Measurement file:
# Project: HPK 6 inch 2018
# Sensor Type: 2S
# ID: VPX28442_11_2S
# Operator: Dominic
# Date: Wed Feb 27 08:48:27 2019
# implant_width: 22
# implant_length: 0
# metal_width: 32
# Campaign: Hamamatsu 6inch 2S
# Creator: Dominic Bloech 03.12.2018
# type: p-type
# pitch: 90
# metal_length: 49504
# thickness: 240
Pad Istrip Rpoly Idark Cac
# current[A] res[Ohm] current[A] cap[F]
1.0 -- -- -- --
2.0 -1.66724566667e-10 1895982.8106 -2.31596666667e-07 1.48568333333e-10
3.0 -1.599834e-10 1889915.74685 -2.30320333333e-07 1.48593666667e-10
4.0 -1.48145666667e-10 1892659.50326 -2.29964666667e-07 1.48617666667e-10
XML template
This section explains the basic structure of the XML template. The template needs at least the entry "Template", which must be a dict.
Below it you can write the structure of the xml file as a yml representation.
All values enclosed by <...> are search parameters which the script looks for in the header.
Example:
XML template: LOCATION: <Location>
Header: Location: HEPHY
xml file output: <LOCATION>HEPHY</LOCATION>
Everything enclosed in //...// is a cloneable template entry. A corresponding template must be present in the config.
As seen in the config file, the template can be structured as
DATA_DUMP_template:
  Idark:
    STRIP: <Pad>
    CURRNT_NAMPR: <Istrip>
    TEMP_DEGC: <Temperature>
    RH_PRCNT: <Humidity>
    BIASCURRNT_NAMPR: <Idark>
Everything enclosed in <...> here is a column name from the file. The values inserted into the final file are then the iterated values of this measurement.
Everything enclosed in "[...]" is an external script call. The enclosed key must have a matching key in the top level of the yaml file. Its value must point to a valid python file, which will then be executed. The output is captured and then inserted as a value into the final xml. If the output must be parsed, the same key with the suffix "_regex" can also be added. This regex is then used to parse the output, and the result is set as the value in the xml.
---
Settings_name: CMSxmlTemplate
DB_uploader_API_module: "C:\\GitRepos\\cmsdbldr\\DB_loader.py" # The directory, where the db uploader is located
DB_downloader_API_module: "C:\\GitRepos\\cmsdbldr\\DB_grap.py --param some_params" # The directory, where the db downloader is located
DB_downloader_API_module_regex: "RUN\\s+NUMBER\\s+(.*)"
Template:
  HEADER:
    TYPE:
      EXTENSION_TABLE_NAME: <EXTENSION_TABLE_NAME>
      NAME: <NAME>
    RUN:
      RUN_TYPE: <Project> # Mandatory: ??? > IS Test Measurement
      RUN_NUMBER: "[DB_downloader_API_module]"
      LOCATION: <Location> # HEPHY
      INITIATED_BY_USER: <Operator> # The Operator of this measurement
      RUN_BEGIN_TIMESTAMP: <Date> # Optional but good to have
      RUN_END_TIMESTAMP: <ENDTIME> # Optional
      COMMENT_DESCRIPTION: <Comment> # Optional
  DATA_SET:
    COMMENT_DESCRIPTION: <DATA_COMMENT> # Optional
    VERSION: <VERSION> # The data version? How many times I started the measurement?
    PART:
      KIND_OF_PART: <Sensor Type> # Hamamatsu 2S Sensor
      BARCODE: <ID> # HPK_VPX28441_1002_2S
    DATA: //DATA_DUMP_template//
File_specific_header:
  Istrip:
    HEADER:
      TYPE:
        EXTENSION_TABLE_NAME: TEST_SENSOR_IS
        NAME: TrackerStrip-Sensor IS Test
      RUN:
        RUN_TYPE: IS Test Measurements
DATA_DUMP_template:
  Idark:
    STRIP: <Pad>
    CURRNT_NAMPR: <Istrip>
    TEMP_DEGC: <Temperature>
    RH_PRCNT: <Humidity>
    BIASCURRNT_NAMPR: <Idark>
For the file given above, this xml template produces the following XML:
<?xml version="1.0" ?>
<root>
  <HEADER>
    <TYPE>
      <EXTENSION_TABLE_NAME>TEST_SENSOR_IS</EXTENSION_TABLE_NAME>
      <NAME>TrackerStrip-Sensor IS Test</NAME>
    </TYPE>
    <RUN>
      <RUN_TYPE>IS Test Measurements</RUN_TYPE>
      <RUN_NUMBER>98765</RUN_NUMBER>
      <LOCATION>None</LOCATION>
      <INITIATED_BY_USER>Dominic</INITIATED_BY_USER>
      <RUN_BEGIN_TIMESTAMP>Wed Feb 27 08:48:27 2019</RUN_BEGIN_TIMESTAMP>
      <RUN_END_TIMESTAMP>None</RUN_END_TIMESTAMP>
      <COMMENT_DESCRIPTION>None</COMMENT_DESCRIPTION>
    </RUN>
  </HEADER>
  <DATA_SET>
    <COMMENT_DESCRIPTION>None</COMMENT_DESCRIPTION>
    <VERSION>None</VERSION>
    <PART>
      <KIND_OF_PART>2S</KIND_OF_PART>
      <BARCODE>VPX28442_11_2S</BARCODE>
    </PART>
    <DATA>
      <STRIP>1.0</STRIP>
      <CURRNT_NAMPR>nan</CURRNT_NAMPR>
      <TEMP_DEGC>21.899999618530273</TEMP_DEGC>
      <RH_PRCNT>23.700000762939453</RH_PRCNT>
      <BIASCURRNT_NAMPR>nan</BIASCURRNT_NAMPR>
    </DATA>
    <DATA>
      <STRIP>2.0</STRIP>
      <CURRNT_NAMPR>166.7245647096749</CURRNT_NAMPR>
      <TEMP_DEGC>21.899999618530273</TEMP_DEGC>
      <RH_PRCNT>23.799999237060547</RH_PRCNT>
      <BIASCURRNT_NAMPR>231.59667250638446</BIASCURRNT_NAMPR>
    </DATA>
    <DATA>
      <STRIP>3.0</STRIP>
      <CURRNT_NAMPR>159.9834015264534</CURRNT_NAMPR>
      <TEMP_DEGC>21.899999618530273</TEMP_DEGC>
      <RH_PRCNT>23.899999618530273</RH_PRCNT>
      <BIASCURRNT_NAMPR>230.3203388009933</BIASCURRNT_NAMPR>
    </DATA>
    <DATA>
      <STRIP>4.0</STRIP>
      <CURRNT_NAMPR>148.14566240417548</CURRNT_NAMPR>
      <TEMP_DEGC>21.899999618530273</TEMP_DEGC>
      <RH_PRCNT>24.100000381469727</RH_PRCNT>
      <BIASCURRNT_NAMPR>229.96466952918124</BIASCURRNT_NAMPR>
    </DATA>
  </DATA_SET>
</root>
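The header lookup for <...> placeholders described in this section can be sketched with the standard library. The functions `header_value` and `fill` are hypothetical helpers written for illustration; they cover neither the //...// cloning nor the [...] script calls, and they are not the framework's actual implementation.

```python
import re
import xml.etree.ElementTree as ET


def header_value(header, key):
    """Look up "key: value" in the header lines, return None if it is missing."""
    for line in header:
        match = re.match(r"#?\s*" + re.escape(key) + r"\s*:\s*(.*)", line)
        if match:
            return match.group(1).strip()
    return None


def fill(parent, template, header):
    """Recursively turn a template dict into XML elements below parent."""
    for tag, value in template.items():
        element = ET.SubElement(parent, tag)
        if isinstance(value, dict):
            fill(element, value, header)
            continue
        placeholder = re.fullmatch(r"<(.*)>", str(value))
        if placeholder:
            # <...> values are searched for in the header
            element.text = str(header_value(header, placeholder.group(1)))
        else:
            # Everything else is copied as it is
            element.text = str(value)


header = ["# Operator: Dominic", "# Sensor Type: 2S", "# ID: VPX28442_11_2S"]
template = {
    "RUN": {"LOCATION": "<Location>", "INITIATED_BY_USER": "<Operator>"},
    "PART": {"KIND_OF_PART": "<Sensor Type>", "BARCODE": "<ID>"},
}
root = ET.Element("root")
fill(root, template, header)
print(ET.tostring(root, encoding="unicode"))
```

A key that is missing from the header, like Location in this example, ends up as the text None, which is also what the output above shows for LOCATION.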
The measurement plugins
The measurement plugins located in the analysis_scripts folder need to be python classes. From the main script they are passed the data and the config dictionaries.
- The data dict: This dictionary contains all the data from the files, with the keys being the base names of the files from the config. Each entry is again a dictionary with the keys:
  - data - a dict with all measurements, each entry containing a ndarray with the data
  - header - a list of str containing the header from the file
  - measurements - a list of all measurements, in the order of the input file
  - units - a list containing the units for each measurement
  - analysed - a bool value (for your use)
  - plots - a bool value (for your use)
- The configs dict: This dictionary is a 1 to 1 representation of your config file
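For orientation, the data dict for a single input file might look roughly like this. All values are invented, plain lists stand in for the ndarrays to keep the sketch free of dependencies, and whether the top level key carries the file suffix is an assumption.

```python
# Hypothetical data dict as handed to a plugin for one input file
data = {
    "MyMeasurementfilePath": {  # assumption: base name without the suffix
        "data": {
            "voltage": [0.0, 10.0, 20.0],
            "current": [1.0e-9, 1.2e-9, 1.5e-9],
        },
        "header": ["# Operator: Dominic", "# type: p-type"],
        "measurements": ["voltage", "current"],
        "units": ["V", "A"],
        "analysed": False,
        "plots": False,
    }
}

# Walk through all files and measurements
for filename, entry in data.items():
    for name, unit in zip(entry["measurements"], entry["units"]):
        print(filename, name, unit, len(entry["data"][name]))
```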
The basic structure of analysis plugins
In principle you can do whatever you want from this point on, but I have written some tools for plotting which will help you create nice plots in no time and with little effort. The structure I use is as follows:
import logging

import holoviews as hv

from forge.tools import plot_all_measurements, convert_to_EngUnits
from forge.tools import convert_to_df, config_layout  # Assumption: these helpers also live in forge.tools
from forge.specialPlots import dospecialPlots


class IVCV:
    def __init__(self, data, configs):
        self.log = logging.getLogger(__name__)
        # Converts the data to pandas dataframes; the optional parameter "abs" returns only the absolute value of each measurement
        self.data = convert_to_df(data, abs=True)
        self.config = configs
        self.df = []
        self.basePlots = None
        self.PlotDict = {}
        # Assumption: convert_to_df stores the measurement (column) names under the key "columns"
        self.measurements = self.data["columns"]

        # Convert the units to the desired ones
        for meas in self.measurements:
            unit = self.config["IVCV_QTC"].get(meas, {}).get("UnitConversion", None)
            if unit:
                self.data = convert_to_EngUnits(self.data, meas, unit)

        hv.renderer('bokeh')

    def run(self):
        """Runs the script"""
        # Plot all measurements; the trailing comma makes do_not_plot a tuple instead of a plain string
        self.basePlots = plot_all_measurements(self.data, self.config, "voltage", "IVCV_QTC", do_not_plot=("voltage",))
        self.PlotDict["BasePlots"] = self.basePlots
        self.PlotDict["All"] = self.basePlots

        # Whiskers plot
        self.WhiskerPlots = dospecialPlots(self.data, self.config, "IVCV_QTC", "BoxWhisker", self.measurements)
        if self.WhiskerPlots:
            self.PlotDict["Whiskers"] = self.WhiskerPlots
            self.PlotDict["All"] = self.PlotDict["All"] + self.WhiskerPlots  # This is how you add plots together in holoviews

        # Reconfig the plots to be sure
        self.PlotDict["All"] = config_layout(self.PlotDict["All"], **self.config["IVCV_QTC"].get("Layout", {}))
        return self.PlotDict
As a return value the framework expects at least a dictionary with the entry "All", in which all plots are combined. If this entry is missing, no plots will be shown and saving is not possible.
Tools and other things
The framework has lots of subroutines which can simplify your workflow. These are located in the folder "forge". I tried to give every function a docstring, but sometimes I am lazy; you will figure out what it does.
- The utilities script gives you some basic functions which are not plot specific; normally you will not need them
- The tools script gives you tools to plot or manipulate data
- The specialPlots script includes all special plot scripts, like Violin, Histogram etc. You can add your own as well
The most important functions are:
- forge.tools.plot_all_measurements - Plots all measurements
- forge.tools.convert_to_EngUnits - Converts the df entries to the specified order of magnitude
- forge.specialPlots.dospecialPlots - Plots all measurements in the desired special plot style
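What "converting to an order of magnitude" means can be illustrated with a few lines of plain Python. This is an illustrative stand-in and not the code of forge.tools.convert_to_EngUnits; the function `convert_values` and the prefix table are assumptions.

```python
# Order of magnitude for a few SI prefixes (subset, for illustration only)
PREFIXES = {"milli": 1e-3, "micro": 1e-6, "nano": 1e-9, "pico": 1e-12, "kilo": 1e3, "mega": 1e6}


def convert_values(values, target):
    """Express values given in the base unit in the target order of magnitude."""
    factor = PREFIXES[target]
    return [value / factor for value in values]


# Absolute dark currents in A, taken from the example ASCII file
idark = [2.31596666667e-07, 2.30320333333e-07]
print(convert_values(idark, "nano"))  # values in nA, roughly 231.6 and 230.3
```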