Batch data generation

Within the simulator, one class is defined as the ‘batch’ and it can be used to simulate batch reactors with defined kinetics.

First, the package is loaded and available default examples are displayed using show_implemented_examples method.

[1]:
from insidapy.simulate.ode import batch
batch().show_implemented_examples()
[+] The following examples are implemented in this BATCH class:
+----------------------+----------------------------------------------------------------------------------+
|  Example ID string   | Description                                                                      |
+----------------------+----------------------------------------------------------------------------------+
|        batch1        | Batch fermentation with 3 species. Bacteria growth, substrate consumption and    |
|                      | product formation. Mimics the production of a target protein.                    |
|        batch2        | Batch fermentation with 4 species. Bacteria growth, nitrate/carbon/phosphate     |
|                      | consumption. Mimics a waste water treatment process.                             |
|        batch3        | Enzyme substrate interaction described by the Michaelis-Menten model. 4 species. |
|                      | E + S <-[k1],[ki1]-> ES ->[k2] E + P                                             |
|        batch4        | Series of reactions. 3 species. A -[k1]-> B -[k2]-> C.                           |
|        batch5        | Van de Vusse reaction. 4 species. A -[k1]-> B -[k2]-> C and 2 A -[k3]-> D.       |
|        batch6        | Batch fermentation with 7 species: Biomass (total, viable, death cells),         |
|                      | glucose, glutamine, lactate, and ammonia. Mimics CHO cell growth in reactor.     |
+----------------------+----------------------------------------------------------------------------------+


Next, one can decide to use one of the implemented examples or define a new one. In this case, the example of a batch fermentation with 3 species is used ('batch1'). Each example would be fully defined with default values. However, we can overwrite these values by using the corresponding inputs. If we overwrite the default values, we are reminded to check that the given values are in correct order.

[2]:
data = batch(   example='batch1',                                       # Choose example. Defaults to "batch1".
                nbatches=4,                                             # Number of batches. Defaults to 3.
                npoints_per_batch=20,                                   # Number of points per batch and per species. Defaults to 20.
                noise_mode='percentage',                                # Noise mode. Defaults to "percentage".
                noise_percentage=2.5,                                   # Noise percentage (in case mode is "percentage")
                random_seed=10,                                         # Random seed for reproducibility. Defaults to 0.
                bounds_initial_conditions=[[0.1, 50, 0], [0.4, 90, 0]], # Bounds for initial conditions. Defaults to "None".
                time_span=[0, 80],                                      # Time span for integration. Defaults to "None".
                initial_condition_generation_method='LHS',              # Method for generating initial conditions. Defaults to "LHS".
                name_of_time_vector='time')                             # Name of time vector. Defaults to "time".
[!] IMPORTANT: It seems that you changed the default bounds of the species. Make sure the order of the indicated bounds is the following: ['biomass', 'substrate', 'product']

We can now print some information about the example.

[3]:
data.print_info()
[+] Loaded the example BATCH1 with the following properties:
+--------------------------------+------------------------------------------------------------------------+
| Property                       | Description                                                            |
+--------------------------------+------------------------------------------------------------------------+
| Example string                 | batch1                                                                 |
| Example description            | Batch fermentation with 3 species. Bacteria growth, substrate          |
|                                | consumption and product formation. Mimics the production of a target   |
|                                | protein.                                                               |
| Short reference information    | ISBN 0-13-512966-4                                                     |
| Number of species              | 3                                                                      |
| Species names                  | ['biomass', 'substrate', 'product']                                    |
| Species units                  | ['g/L', 'g/L', 'g/L']                                                  |
| Number of batches              | 4                                                                      |
| Number of samples              | 20                                                                     |
| Time span                      | [0, 80]                                                                |
| Time unit                      | h                                                                      |
| Noise mode                     | percentage                                                             |
| Noise percentage               | 2.5%                                                                   |
| Lower bounds for experiments   | [0.1, 50.0, 0.0]                                                       |
| Upper bounds for experiments   | [0.4, 90.0, 0.0]                                                       |
+--------------------------------+------------------------------------------------------------------------+

This might be useful to use the example in a publication (use the short reference information given to reference where the example came from) or to check if the example is defined as intended.

After that, we can run the experiments to create some in-silico data using the run_experiments method. We can then for example check the data of the first experiment.

[4]:
data.run_experiments()
print(data.y_noisy[0])
[+] Experiments done.
         time   biomass  substrate    product
0    0.000000  0.209931  75.317747   0.066468
1    4.210526  0.412243  70.274789   0.024264
2    8.421053  0.977525  68.432262   0.902232
3   12.631579  1.328248  61.199162   1.619636
4   16.842105  1.456481  54.647587   3.486157
5   21.052632  2.304200  46.356629   5.382192
6   25.263158  3.096617  34.720784   6.968374
7   29.473684  3.642308  23.468698   8.556058
8   33.684211  4.521516  14.253261  10.275535
9   37.894737  5.027557   8.758139  11.159293
10  42.105263  5.280977   4.734118  12.172444
11  46.315789  5.185576   1.282952  12.280378
12  50.526316  5.597395   1.225398  12.236056
13  54.736842  5.538414   0.024317  12.101252
14  58.947368  5.521777   2.407662  12.260585
15  63.157895  5.454142  -0.574597  12.511703
16  67.368421  5.617953  -0.238538  12.145786
17  71.578947  5.642230  -0.296372  12.394362
18  75.789474  5.490113  -0.390336  12.776483
19  80.000000  5.552931   0.133594  12.458358

Most modeling approaches require a training dataset and a separate testing dataset. To generate separate datasets, the user can apply a splitting in an sklearn-manner. There is no default value set. In case the user calls the function, a test_splitratio in the range [0,1) needs to be chosen. The value represents the fraction of the total number of batches generated used for the test set. The data is then splitted and stored in the data object as data.training and data.testing.

[5]:
data.train_test_split(test_splitratio=0.2)

We can now plot the experiments. The fist way to do so is to plot all the runs using the plot_experiments method. The method lets us save the figure using a path (save_figure_directory), a name (figname) and an some extensions (save_figure_exensions) as a list. By using show=False, the plot will not be displayed in a running code.

[6]:
data.plot_experiments(  save=True,
                        show= False,
                        figname=f'{data.example}_simulated_batches',
                        save_figure_directory=r'.\figures',
                        save_figure_exensions=['png'])
[+] Saving figure:
        ->png: .\figures\batch1_simulated_batches.png
../_images/notebooks_main_batch_examples_11_1.png

We can also visualize the training and testing runs individually using the plot_train_test_experiments method.

[7]:
data.plot_train_test_experiments(   save=True,
                                    show=False,
                                    figname=f'{data.example}_simulated_batches_train_test',
                                    save_figure_directory=r'.\figures',
                                    save_figure_exensions=['png'])
[+] Saving figure:
        ->png: .\figures\batch1_simulated_batches_train_test.png
../_images/notebooks_main_batch_examples_13_1.png

After the simulation, one can export the data as XLSX files. By choosing which_dataset to be training (only executable if train_test_split was applied), testing (only executable if train_test_split was applied), or all (always executable), the corresponding data is exported to the indicated location:

[8]:
data.export_dict_data_to_excel(destination=r'.\data', which_dataset='all')      # Exports all the data
data.export_dict_data_to_excel(destination=r'.\data', which_dataset='training') # Exports the training data (blue circles in Fig 2)
data.export_dict_data_to_excel(destination=r'.\data', which_dataset='testing')  # Exports the training data (red diamonds in Fig 2)
[+] Exported batch data to excel.
        -> Dataset: ALL (options: training, testing, all)
        -> Noise free data to: .\data\batch_batch1_all_batchdata.xlsx
        -> Noisy data to: .\data\batch_batch1_all_batchdata_noisy.xlsx
[+] Exported batch data to excel.
        -> Dataset: TRAINING (options: training, testing, all)
        -> Noise free data to: .\data\batch_batch1_training_batchdata.xlsx
        -> Noisy data to: .\data\batch_batch1_training_batchdata_noisy.xlsx
[+] Exported batch data to excel.
        -> Dataset: TESTING (options: training, testing, all)
        -> Noise free data to: .\data\batch_batch1_testing_batchdata.xlsx
        -> Noisy data to: .\data\batch_batch1_testing_batchdata_noisy.xlsx