Plastering is a unified framework for normalizing building metadata. Different frameworks can be unified into a workflow and/or compared with each other in Plastering.
Getting Started
Installation
- Install MongoDB: instruction
- Install dependencies:
  pip install -r requirements.txt
- Install Plastering package:
  python setup.py install
- Download the dataset here. This link is not public yet because UCSD has not approved sharing the data publicly. A procedure for signing an agreement may be added, but it is still under development. Until then, use the synthesized data described in the example below to test the algorithms.
Example with synthesized data
- Load data:
  python examples/tutorial/load_data.py
- Run Zodiac:
  python examples/tutorial/zodiac_tutorial.py
  - This will print out accuracy (F1 scores) step by step.
Example with SDH data
- Load data:
  python examples/tutorial/load_data_sdh.py
- Run Scrabble:
  python examples/tutorial/scrabble_tutorial.py
  - This produces scrabble_output.ttl.
  - There will be an update about how to produce other types of results (metrics, other files, etc.)
Example with synthesized data and Active Partial Labelling
- Load data:
  python examples/tutorial/load_data.py
- Run Active Partial Labelling:
  python examples/tutorial/Active_Partial_Tutorial.py
Other examples
- Run Zodiac test:
  python test_zodiac.py
- Run Workflow test:
  python test_workflow.py
- Run Zodiac experiments:
  python scripts/exp_zodiac.py ap_m
- Produce figures:
  python scripts/result_drawer.py
Specification
Data Format
Raw Metadata
- It is defined as RawMetadata inside plastering/metadata_interface.py.
- Every BMS point is associated with a unique source identifier (srcid) and a building name.
- All BMS metadata is in the form of a JSON document. A BMS point corresponds to a row with metadata possibly in multiple entries. Example:
  { "srcid": "123-456", "VendorGivenName": "RM-101.ZNT", "BACnetName": "VMA101 Zone Temp", "BACnetUnit": 64 }
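As a rough illustration of the raw format above, the sketch below parses the example document with plain Python and separates the identifier from the metadata entries. The helper name validate_raw_point is hypothetical and is not part of Plastering's RawMetadata definition.

```python
import json


def validate_raw_point(doc: dict) -> dict:
    """Return the metadata entries of a raw point, keyed by metadata type.

    Hypothetical helper for illustration; not part of Plastering.
    """
    srcid = doc.get("srcid")
    if not isinstance(srcid, str) or not srcid:
        raise ValueError("every BMS point needs a non-empty string srcid")
    # Everything except the identifier is treated as a metadata entry.
    return {key: value for key, value in doc.items() if key != "srcid"}


raw = json.loads(
    '{"srcid": "123-456", "VendorGivenName": "RM-101.ZNT", '
    '"BACnetName": "VMA101 Zone Temp", "BACnetUnit": 64}'
)
entries = validate_raw_point(raw)
print(sorted(entries))
```

A point without a srcid is rejected early, since every later step (labels, timeseries, predictions) is keyed by it.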
Ground Truth of Metadata (LabeledMetadata)
- It is defined as LabeledMetadata inside plastering/metadata_interface.py.
- tagsets: Any TagSets associated with the point.
- point_tagset: Point TagSet among the associated TagSets. If it's not defined, one may select Point-related TagSets from tagsets.
- fullparsing: Each entry has parsing results. An example for 123-456's VendorGivenName:
  - Tokenization: ["RM", "-", "101", ".", "ZN", "T"]
  - Token Labels: ["Room", None, "leftidentifier", None, "Zone", "Temperature"]
  - (Though Plastering by default supports the above token-label sets, a framework may apply different tokenization rules. For example, one may want to use ZNT -> Zone_Temperature_Sensor instead. Such combinations can be extended later.)
- One may use a subset of the label types or add a new label type if needed.
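The fullparsing example above pairs each token with one label. A minimal sketch of that alignment, assuming the two lists are stored side by side (the function names here are hypothetical, not Plastering's API):

```python
tokens = ["RM", "-", "101", ".", "ZN", "T"]
labels = ["Room", None, "leftidentifier", None, "Zone", "Temperature"]


def pair_tokens(tokens, labels):
    """Align tokens with their labels; both lists must have the same length."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must align one to one")
    return list(zip(tokens, labels))


def labeled_only(pairs):
    """Drop delimiter tokens, which carry the label None."""
    return [(token, label) for token, label in pairs if label is not None]


pairs = pair_tokens(tokens, labels)
# Concatenating the tokens should reproduce the original entry.
original = "".join(token for token, _ in pairs)
print(original)
print(labeled_only(pairs))
```

Checking that the tokens concatenate back to the original string is a cheap way to catch a tokenization that dropped or duplicated characters.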
Raw Timeseries Data
- Every BMS point may produce timeseries data associated with the corresponding srcid.
- Its data format is TODO.
Output Metadata in Brick
- Brick graph: Result graph in Brick (in Turtle syntax).
  ex:RM_101_ZNT rdf:type brick:Zone_Temperature_Sensor .
  ex:RM_101 rdf:type brick:Room .
  ex:RM_101_ZNT bf:hasLocation ex:RM_101 .
- Confidences: A map from triples to their confidences.
  - A key is a triple of strings and the value is its confidence. If the triple is given by the user, it should be 1.0. E.g.,
    { ("ex:RM_101_ZNT", "rdf:type", "brick:Zone_Temperature_Sensor"): 0.9, .. }
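One way to use the confidence map described above is to keep only the triples above a threshold before writing them out. The sketch below does this with a plain dict. The 0.9 entry comes from the example above; the other two confidence values and both helper names are made up for illustration.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

confidences: Dict[Triple, float] = {
    ("ex:RM_101_ZNT", "rdf:type", "brick:Zone_Temperature_Sensor"): 0.9,
    ("ex:RM_101", "rdf:type", "brick:Room"): 1.0,  # illustrative: given by the user
    ("ex:RM_101_ZNT", "bf:hasLocation", "ex:RM_101"): 0.6,  # illustrative value
}


def confident_triples(conf_map: Dict[Triple, float], threshold: float) -> List[Triple]:
    """Keep triples whose confidence is at least the threshold."""
    return [triple for triple, conf in conf_map.items() if conf >= threshold]


def to_turtle_lines(triples: List[Triple]) -> List[str]:
    """Render prefixed-name triples as Turtle statements (prefix declarations omitted)."""
    return [f"{s} {p} {o} ." for s, p, o in triples]


for line in to_turtle_lines(confident_triples(confidences, 0.8)):
    print(line)
```

Triples supplied by the user always survive the filter because their confidence is 1.0.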
Framework Interface
- Each framework should be instantiated through the common interface.
Common Procedure
- Prepare the data in MongoDB. Example: data_init.py
- The number of seed samples is given, and a framework is initialized with that number along with other configurations. These depend on the framework, as different frameworks may require different initial inputs.
  conf = {
      'source_buildings': ['ebu3b'],
      'source_samples_list': [5],
      'logger_postfix': 'test1',
      'seed_num': 5
  }
  target_building = 'ap_m'
  scrabble = ScrabbleInterface(target_building, conf)
- Start learning the entire building's metadata with the instance.
  scrabble.learn_auto() # This function name may change in the near future.
  Each step inside learn_auto() looks like this:
  - Pick the most informative samples in the target building.
    # This differs from the actual Scrabble code, which handles the whole process internally.
    new_srcids = self.scrabble.select_informative_samples(10)
  - Update the model.
    self.update_model(new_srcids)
  - Infer with the updated model.
    pred = self.scrabble.predict(self.target_srcids)
  - Store the current model's performance.
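The four steps of learn_auto() described above can be sketched as a self-contained loop. ToyFramework below is a hypothetical stand-in, not ScrabbleInterface or any real Plastering class: it picks samples at random instead of by informativeness, treats a ground-truth dict as the expert, and records plain accuracy rather than F1.

```python
import random
from collections import Counter


class ToyFramework:
    """Hypothetical stand-in for a framework; not Plastering's real interface."""

    def __init__(self, ground_truth, seed=0):
        self.ground_truth = dict(ground_truth)  # srcid -> label, plays the expert
        self.target_srcids = sorted(self.ground_truth)
        self.labeled = {}  # srcid -> label learned so far
        self.rng = random.Random(seed)

    def select_informative_samples(self, n):
        # Placeholder strategy: random unlabeled points.
        pool = [s for s in self.target_srcids if s not in self.labeled]
        return self.rng.sample(pool, min(n, len(pool)))

    def update_model(self, new_srcids):
        # Ask the "expert" for labels of the selected points.
        for srcid in new_srcids:
            self.labeled[srcid] = self.ground_truth[srcid]

    def predict(self, srcids):
        # Toy model: known label if available, else the most common label so far.
        default = None
        if self.labeled:
            default = Counter(self.labeled.values()).most_common(1)[0][0]
        return {s: self.labeled.get(s, default) for s in srcids}

    def learn_auto(self, iterations=3, batch_size=2):
        history = []
        for _ in range(iterations):
            new_srcids = self.select_informative_samples(batch_size)  # step 1
            self.update_model(new_srcids)                             # step 2
            pred = self.predict(self.target_srcids)                   # step 3
            correct = sum(pred[s] == self.ground_truth[s] for s in self.target_srcids)
            history.append(correct / len(self.target_srcids))         # step 4
        return history


# Made-up ground truth for six points.
ground_truth = {f"p{i}": ("Zone_Temperature_Sensor" if i % 2 else "Room") for i in range(6)}
framework = ToyFramework(ground_truth)
history = framework.learn_auto(iterations=3, batch_size=2)
print(history)
```

With six points and three batches of two, every point is labeled by the last iteration, so the final recorded accuracy reaches 1.0. The intermediate values depend on which points the random sampler happens to pick.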
Workflow
- Each framework aligned to the interface (./plastering/inferencers/inferencer.py) can be a part, called Inferencer, of a workflow to Brickify a building.
- Workflow/Inferencer interface is defined under TODO.
- Workflow usage scenario:
  - Each part is initialized with the raw data for target buildings in the format described in Data Format.
  - In each iteration, each part runs its algorithm sequentially.
  - Each part receives the result from the previous part and, if necessary, samples from an expert.
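The usage scenario above can be sketched as a chain in which each part receives the previous part's result. ToyInferencer and run_workflow are hypothetical names for illustration and do not reflect the interface in plastering/inferencers/inferencer.py. The second point in the raw data is made up, and expert sampling is left out for brevity.

```python
from typing import Dict, List, Optional

Raw = Dict[str, Dict[str, str]]
Result = Dict[str, Optional[str]]


class ToyInferencer:
    """Hypothetical part: assigns a tagset when a keyword appears in one field."""

    def __init__(self, field: str, keyword: str, tagset: str):
        self.field = field
        self.keyword = keyword
        self.tagset = tagset

    def run(self, raw: Raw, previous: Result) -> Result:
        # Start from the previous part's result and only fill in the gaps.
        result = dict(previous)
        for srcid, doc in raw.items():
            if result.get(srcid) is None and self.keyword in doc.get(self.field, ""):
                result[srcid] = self.tagset
        return result


def run_workflow(parts: List[ToyInferencer], raw: Raw, iterations: int = 1) -> Result:
    result: Result = {srcid: None for srcid in raw}
    for _ in range(iterations):
        for part in parts:  # parts run sequentially within an iteration
            result = part.run(raw, result)
    return result


raw = {
    "123-456": {"VendorGivenName": "RM-101.ZNT", "BACnetName": "VMA101 Zone Temp"},
    "123-457": {"VendorGivenName": "RM-102.XX", "BACnetName": "VMA102 Zone Temp"},  # made up
}
parts = [
    ToyInferencer("VendorGivenName", "ZNT", "Zone_Temperature_Sensor"),
    ToyInferencer("BACnetName", "Zone Temp", "Zone_Temperature_Sensor"),
]
first_only = run_workflow(parts[:1], raw)
result = run_workflow(parts, raw)
print(first_only)
print(result)
```

The first part alone leaves the second point unlabeled; adding the second part fills it in from a different metadata field, which is the motivation for combining frameworks in one workflow.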
Benchmark
- Plastering can also be used to benchmark different algorithms. It defines the common dataset and the interactions with the expert who provides learning samples.
- Benchmark usage scenario.
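As a sketch of what comparing two algorithms on a common dataset could look like, the code below scores two sets of predicted TagSets against the same ground truth with a micro-averaged F1. The exact metric Plastering reports is not specified here, so treat the function micro_f1, the second point, and both prediction sets as assumptions made for illustration.

```python
from typing import Dict, Set

TagSets = Dict[str, Set[str]]


def micro_f1(pred: TagSets, truth: TagSets) -> float:
    """Micro-averaged F1 over all (srcid, tagset) pairs in the ground truth."""
    true_pos = sum(len(pred.get(srcid, set()) & tags) for srcid, tags in truth.items())
    n_pred = sum(len(pred.get(srcid, set())) for srcid in truth)
    n_true = sum(len(tags) for tags in truth.values())
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_true
    return 2 * precision * recall / (precision + recall)


# Shared ground truth (second point is made up).
truth: TagSets = {
    "123-456": {"Zone_Temperature_Sensor", "Room"},
    "123-457": {"Room"},
}
# Made-up outputs of two hypothetical algorithms.
pred_a: TagSets = {"123-456": {"Zone_Temperature_Sensor", "Room"}, "123-457": {"Room"}}
pred_b: TagSets = {"123-456": {"Zone_Temperature_Sensor"}, "123-457": {"Zone"}}

scores = {"A": micro_f1(pred_a, truth), "B": micro_f1(pred_b, truth)}
print(scores)
```

Because both algorithms are scored against the same ground truth with the same function, their numbers are directly comparable.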
Examples
- Initialize data
  python data_init.py -b ap_m
- Test with Zodiac