Appa Build in depth#

Find an activity by searching in exchanges#

In Brightway, ecoinvent datasets are identified by a random code (uuid). To find the code of the desired dataset, one option is to search for it manually in the Activity Browser.

electricity.yaml#
exchanges:
- database: user_database
  name: electricity
  type: technosphere
  amount: 1
  switch:
    name: usage_location
    options:
    - name: EU
      input:
        uuid: 6f3bff7be2bf2f55a7afd206b7512bfd
        database: ecoinvent_3.9.1_cutoff
    - name: FR
      input:
        uuid: 3463ba8aaa04e2999e8bc0361df969be
        database: ecoinvent_3.9.1_cutoff

Since this can be tedious and error-prone, Appa Build allows you to search for activities dynamically, using regular expressions on the activity fields. The set of regexes must match exactly one activity. The following example is equivalent to the previous one: it looks for an activity using a regex on the name and an exact match on the location.

electricity.yaml#
exchanges:
- database: user_database
  name: electricity
  type: technosphere
  amount: 1
  switch:
    name: usage_location
    options:
    - name: EU
      input:
        name: "market (group )?for electricity, low voltage"
        location: "RER"
        database: ecoinvent_3.9.1_cutoff
    - name: FR
      input:
        name: "market (group )?for electricity, low voltage"
        location: "FR"
        database: ecoinvent_3.9.1_cutoff

Parameter propagation#

Performing a screening LCA of a low-TRL technology may require generic datasets that can be parameterized and/or modularized. This feature is available in most LCA software, including Brightway and lca_algebraic.

Appa Build builds on this feature with the notion of parameter propagation. In short, it allows an upstream dataset to influence the parameterization of a downstream dataset. For example, an nvidia_gpu_chip dataset could:

  • Set the fab_location parameter used in a downstream wafer production dataset to a specific value.

  • Replace the area parameter used in a downstream die production dataset with a formula that is a function of cuda_core.

  • Rename a leads parameter used in a downstream package production dataset to io.

The new parameters used when manipulating this nvidia_gpu_chip dataset could then be modified again by different FU datasets in the context of different LCAs, so that the impact models always expose the most appropriate set of parameters, which recursively affects the parameterization of all mobilized datasets. This mechanism allows you to define very generic datasets at the leaves, common to most of your LCAs, and progressively more specialized datasets the closer you get to the datasets representing the different FUs of your LCAs.

Let’s look at the structure of the example LCA:

Diagram of the datasets and corresponding parameterization used in Appa Build to generate the example impact model.

The logic_wafer_manufacturing dataset is generic and is parameterized by two parameters: fab_location and masks.

logic_wafer_manufacturing.yaml#
parameters:
- fab_location
- masks
exchanges:
- database: user_database
  name: cmos_wafer_production
  type: technosphere
  switch:
    name: fab_location
    options:
    - name: TW
      amount: "(0.049*masks + 0.3623) * 3.14159 * pow(15, 2)" # impact is originally per cm², and we want it per 300 mm wafer (area = π × 15² cm²)
  input:
    database: impact_proxies
    uuid: "('EF v3.0', 'climate change', 'global warming potential (GWP100)')_technosphere_proxy"

It is used as an input to logic_die_manufacturing, which introduces a new area parameter in addition to the two parameters required by the wafer dataset.

logic_die_manufacturing.yaml#
parameters:
- area
- fab_location
- masks
exchanges:
- database: user_database
  name: cmos_die_production
  type: technosphere
  amount: "1/(300*3.14159*((300/(4*area))-(1/(sqrt(2*area)))))" # 1 / (gross dies per 300 mm wafer)
  input:
    database: user_database
    uuid: logic_wafer_manufacturing

It is used as an input to the functional_logic_die_manufacturing dataset, which introduces the defect_density parameter to model the yield. It also takes advantage of the parameter matching feature to replace the masks parameter with a function of technology_node. As a result, using functional_logic_die_manufacturing no longer requires the masks parameter, but technology_node instead. In the context of embedded AI ecodesign, using the technology node is interesting because it is more easily accessible to embedded AI designers than the number of lithography masks, and the former can be used to estimate the latter.

functional_logic_die_manufacturing.yaml#
parameters:
- area
- defect_density # defect/mm²
- fab_location
- technology_node
exchanges:
- database: user_database
  name: functional_cmos_die_production
  type: technosphere
  amount: "1/pow((1-exp(-defect_density*area))/(defect_density*area), 2)" # inverse of Murphy's yield model
  parameters_matching:
    masks: "137.24 * pow(technology_node, -0.317)"
  input:
    database: user_database
    uuid: logic_die_manufacturing

It is used as an input to the nvidia_gpu_die_manufacturing dataset, which is highly specialized. Nvidia GPU dies of interest can be defined by their architecture and their number of cuda_core. For the two architectures of interest, we can set the fab_location parameter to the TW value and fix the technology_node and defect_density parameters. The area can also be reasonably estimated from the number of cuda_core for both architectures.

nvidia_gpu_die_manufacturing.yaml#
parameters:
- cuda_core
- architecture
exchanges:
- database: user_database
  name: logic_die
  type: technosphere
  amount: 1
  switch:
    name: architecture
    options:
    - name: Pascal
      parameters_matching:
        defect_density: 0.05
        technology_node: 16 # is actually 14 for 2 chips, and 16 for 4 chips
        fab_location:
          TW: 1
        area: 0.13184623155305694*cuda_core + 21.707425626610416
    - name: Maxwell # also includes Maxwell 2.0
      parameters_matching:
        defect_density: 0.02
        technology_node: 28
        fab_location:
          TW: 1
        area: 0.1889809692866578*cuda_core + 19.47688243064738
  input:
    database: user_database
    uuid: functional_logic_die_manufacturing

Now, the user only has to specify the cuda_core and architecture parameters to parameterize the impacts of manufacturing the embedded Nvidia AI accelerator.

Note

It is important to note that the same dataset can be used in multiple exchanges with different parameterizations, as sketched below.
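For instance, a hypothetical dataset could mobilize functional_logic_die_manufacturing in two exchanges with different parameterizations (the exchange names and values below are illustrative, not taken from the samples):

exchanges:
- database: user_database
  name: compute_die # hypothetical first exchange
  type: technosphere
  amount: 1
  parameters_matching:
    defect_density: 0.05 # one parameterization of the shared dataset...
  input:
    database: user_database
    uuid: functional_logic_die_manufacturing
- database: user_database
  name: io_die # hypothetical second exchange of the same dataset
  type: technosphere
  amount: 1
  parameters_matching:
    defect_density: 0.02 # ...and a different one
  input:
    database: user_database
    uuid: functional_logic_die_manufacturing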

Rename a parameter

Parameter matching can be used to rename a parameter. Simply specify the new parameter name instead of a value or formula. Note that, to date, Enum parameters cannot be renamed.
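For example, renaming the leads parameter of a downstream package production dataset to io (the operation mentioned for nvidia_gpu_chip above) could look like the following sketch, where the fragment is hypothetical:

parameters_matching:
  leads: io # the downstream leads parameter is now driven by a new io parameter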

Import data at the impact level#

Motivations#

Importing datasets at the impact level can be useful if the source provides only impact data and no LCI data, or if you are trying to import data from a database that is not supported by Brightway. In the second case, LCI data (i.e. data at the elementary flow level) exist but are difficult to connect to Brightway, mainly because elementary flows are not consistent across LCIA methods, LCI databases and LCA software. This issue has been discussed in detail[1]. In a nutshell, elementary flows are named and classified differently in different sources. Matching them all together, and matching them with characterization factors, is a task that cannot reasonably be done manually, as there are hundreds, if not thousands, of them[2]. A unified elementary flow list has been proposed[3], but has not been widely implemented yet. In the meantime, importing data at the impact level may be an interesting workaround.

Limitations#

The first and most obvious limitation is that any valuable information carried at the elementary flow level is lost. For example, you wouldn’t be able to tell which greenhouse gas emission is responsible for the climate change impact of data imported at the impact level.

The second limitation is that, since the impacts are calculated by software other than Brightway, there can be inconsistencies, as different software can produce different impact values even when using the same LCIA methods[4].

The third limitation is that you must have the impact data calculated using the LCIA methods of your choice. If you intend to use EF v3.1 impacts in your LCA, you will need to find EF v3.1 impacts for all of your data imported at the impact level, and you will need to cover all the required indicators.

Warning

Be careful not to run your LCA with indicators that are not covered by all of your data imported at the impact level, or you will underestimate the impacts in your impact model. We plan to add a warning when building the impact model in this case.

How to do it manually with Appa Build?#

Appa Build generates a set of impact proxies at startup. Impact proxies are datasets used to generate impacts without relying on real biosphere flows and characterization factors. One impact proxy is generated for each LCIA method. The name of the impact proxy is {bw_method_name}_technosphere_proxy, where {bw_method_name} is the name of the LCIA method in Brightway. The list of available bw_method_name is as follows:

How is it implemented?

For each LCIA method, one biosphere dataset is created and linked to the LCIA method with a characterization factor of one, together with a technosphere dataset that takes one unit of the corresponding biosphere proxy as its input exchange. Both are necessary: a characterization factor must be connected to a biosphere dataset, and a technosphere dataset is required to use the proxy as an exchange in a foreground dataset.
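As a purely conceptual sketch (this is not an actual Appa Build file, and only the technosphere proxy uuid follows the documented naming scheme), the pair of proxies generated for one LCIA method can be pictured as:

biosphere_proxy: # linked to the LCIA method with a characterization factor of 1
  database: impact_proxies
technosphere_proxy:
  database: impact_proxies
  uuid: "('EF v3.0', 'climate change', 'global warming potential (GWP100)')_technosphere_proxy"
  exchanges:
  - type: biosphere
    amount: 1 # one unit of the corresponding biosphere proxy
    input: biosphere_proxy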

An example of how to use an impact proxy is available in the electricity_no_ei.yaml sample dataset:

electricity_no_ei.yaml#
name: electricity_no_ei
location: GLO
type: process
unit: kwh
amount: 1
parameters:
- usage_location # only FR or EU supported
comment: "Low voltage electricity using 2023 https://www.eea.europa.eu/ data."
exchanges:
- database: user_database
  name: electricity
  type: technosphere
  switch:
    name: usage_location
    options:
    - name: EU
      amount: 0.005
      input:
        database: impact_proxies
        uuid: "('EF v3.0', 'climate change', 'global warming potential (GWP100)')_technosphere_proxy"
    - name: FR
      amount: 0.021
      input:
        database: impact_proxies
        uuid: "('EF v3.0', 'climate change', 'global warming potential (GWP100)')_technosphere_proxy"

How to do it automatically with Appa Build?#

Writing each dataset manually can be tedious and error-prone. We have developed LCA software output parsers to automate the generation of impact-level datasets. So far, only the EIME v6 generator has been developed, but the work can easily be replicated for other LCA software.

The first step in importing EIME v6 data at the impact level is to use the software to export the desired datasets. To create the appropriate export file, create a new EIME v6 project, add a single phase (no specific phase name required), and add a copy of each dataset you want to export. Then go to the Analysis page, check all the required impact assessment methods, and export the result as an xlsx file. This export file only contains the impacts of the datasets, not the other required Activity fields. These fields must be provided in an additional configuration file.

Common fields for all datasets can be defined once in a “default” dict. Fields specific to each dataset are provided as a list under the “datasets” key. Each dataset should have a “name_in_export” key whose value is the name of the corresponding dataset in the EIME v6 export.

An example can be found in the samples/ directory. To run it with the CLI, you can use the following command:

appabuild database generate-eime-v6 samples/exports/sample_eime_v6_export.xlsx samples/conf/gen_sample_eime_v6.yaml outputs/sample_eime_v6_datasets/

samples/exports/sample_eime_v6_export.xlsx contains fake PEF impacts for two mock datasets.
samples/conf/gen_sample_eime_v6.yaml contains the remaining information. This configuration file has the following structure:

gen_sample_eime_v6.yaml#
default: #(1)!
  database: user_database
  unit: unit
  type: "technosphere"
  comment: "Negaoctet mock sample dataset exported from Eime V6."
  amount: 1
datasets: #(2)!
  - name_in_export: "    Mock dataset RER" #(3)!
    location: RER
    name: mock_dataset_RER
    uuid: mock_dataset_RER
  - name_in_export: "    Mock dataset FR"
    location: FR
    name: mock_dataset_FR
    uuid: mock_dataset_FR
  1. Default fields are present in every dataset unless overridden.

  2. Fields specific to a dataset are given as entries of the datasets list.

  3. Should match the name of the dataset in the export file. Don’t forget to include the spaces at the beginning!

Life cycle inventory graphs#

Life cycle inventory graphs (LCI graphs) give a visual representation of the structure of the life cycle inventory of an LCA.

Structure of a graph#

An LCI graph is a tree displaying all the downstream activities of the foreground database and the corresponding exchanges. Each node is an activity, and the root is the FU. Edges represent the exchanges; each edge carries a label showing the dynamic propagation of parameters, with parameter matching represented like a function call.

For example, the parameter matching in the nvidia_ai_gpu_chip dataset:

inference: inference_per_day * lifespan * 365.25

will be represented like this in an edge label:

inference=f(inference_per_day,lifespan)

Here is the graph generated using the sample datasets, with the nvidia_ai_gpu_chip dataset as the FU:

How to generate a graph?#

To generate a graph, use the command:

appabuild lca graph PATH_TO_DATASETS NAME_OF_FU_DATASET OPTIONS

Attention

Graph images are generated using a remote API, so it is recommended not to use this command with sensitive data. A confirmation prompt will appear before graph generation starts.

Command options

  • --type VALUE: type of the output image; can only be png or svg. The default is png.

  • --width VALUE: width of the output image; must be an integer greater than 0.

  • --height VALUE: height of the output image; must be an integer greater than 0.

  • --no-sensitive: skip the confirmation prompt and start the graph generation directly.
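For example, assuming the foreground datasets live in a samples/datasets/ directory (the path is illustrative), the following command would generate a 1920×1080 SVG graph and skip the confirmation prompt:

appabuild lca graph samples/datasets nvidia_ai_gpu_chip --type svg --width 1920 --height 1080 --no-sensitive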