General d-dimensional (d-variable) phase diagram

The most common use case is investigating a \(d\)-dimensional (\(d\)-variable) phase diagram. In this example, we pass arrays as input and print the next experimental point to standard output.

Suggest one experimental point to be tested

First, import pdc_sampler from AIPHAD. numpy must also be imported, as follows.

from aiphad import pdc_sampler
import numpy as np
  1. Specify phase diagram estimation method, uncertainty score, number of proposals

  • estimation : Phase diagram estimation method is specified. “LP”, “LS” can be selected.

  • sampling : Uncertainty score is specified. “LC”, “MS”, “EA”, “RS” can be selected.

  • proposal : Number of proposals is specified.

If “LP” is used as estimation method, “LC” is used as uncertainty score, and the number of proposals is 1, the python code is written as follows.

pdc  = pdc_sampler(estimation = "LP", sampling = "LC", proposal = 1)

See “All options for input” for more input options.

  2. Specify input arrays

  • X : All candidate points in the discretized phase diagram are described. It can handle arbitrary dimensions.

  • y : The phase index, numbered in order from 0, is given for each point whose phase has already been identified. For points whose phase has not been identified, the phase index -1 is specified.

For example, the input arrays are specified as follows.

X = np.array([
        [0.00,0.00,0.00],
        [0.00,1.00,0.00],
        [1.00,0.00,0.00],
        [1.00,1.00,0.00],
        [0.00,0.00,1.00],
        [0.00,1.00,1.00],
        [1.00,0.00,1.00],
        [1.00,1.00,1.00],
        ])

y = np.array([-1, -1, 0, 1, 2, -1, -1, 2])
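For larger phase diagrams, writing every candidate point by hand becomes tedious. The following is a minimal sketch of building X and y with numpy alone; the names `axis_values`, `X_grid`, `y_grid`, and `known_phases` are illustrative assumptions and are not part of AIPHAD. Note that the row order produced here differs from the hand-written X above, so indices must be read against the array that is actually passed in.

```python
import itertools
import numpy as np

# Hypothetical axis values for a 3-variable phase diagram (illustrative only).
axis_values = [
    np.linspace(0.0, 1.0, 2),  # variable 1
    np.linspace(0.0, 1.0, 2),  # variable 2
    np.linspace(0.0, 1.0, 2),  # variable 3
]

# Every combination of axis values becomes one candidate point (one row).
X_grid = np.array(list(itertools.product(*axis_values)))

# Start with every point unidentified (-1), then fill in the known phases.
y_grid = np.full(len(X_grid), -1, dtype=int)
known_phases = {(1.0, 0.0, 0.0): 0, (1.0, 1.0, 0.0): 1}  # illustrative measurements
for i, point in enumerate(X_grid):
    key = tuple(point)
    if key in known_phases:
        y_grid[i] = known_phases[key]

print(X_grid.shape)
print(y_grid)
```

Increasing the number of values per axis, or adding more axes, scales the same construction to finer or higher-dimensional grids.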
  3. Perform phase diagram estimation

By using the pdc.fit() function as follows, the phase diagram is estimated by the label propagation method for the input arrays.

pdc.fit(X,y)
  4. Calculate uncertainty score

By using the pdc.us() function as follows, the calculation of the uncertainty score is performed.

pdc.us()
  5. Output next candidate points to be tested

The next candidate experiments are suggested according to the calculated uncertainty score. Candidates for the next experiment can be obtained as a list by executing the following function.

  • pdc.proposals : As a candidate for the next experiment, we can get the index of the proposed point relative to X.

  • pdc.proposals_X : As a candidate for the next experiment, we can get the position of the proposed point (i.e. X).

  • pdc.proposals_us : We can get the uncertainty score of each selected experimental point.

By specifying as follows, we can print this information to standard output.

proposals_index = pdc.proposals
proposals_X = pdc.proposals_X
proposals_us = pdc.proposals_us

print("proposal_index:", proposals_index)
print("proposal_X:", proposals_X)
print("proposal_us:", proposals_us)

The following result is output. The next experimental point is suggested as [0.0, 0.0, 0.0].

proposal_index: [0]
proposal_X: [[0.0, 0.0, 0.0]]
proposal_us: [0.5]
  6. Output candidate points with a high belonging probability to each phase

If there is a desired phase, AIPHAD can output candidates with a high probability of belonging to each phase. Note that this is not a proposal for investigating a detailed phase diagram.

  • pdc.belonging_index : We can get the index for X of the candidate point with high probability of belonging to each phase.

  • pdc.belonging_X : We can get the position (input as X) of the candidate point with high probability of belonging to each phase.

  • pdc.belonging_probability : We can get the belonging probability of the candidate point with high probability of belonging to each phase.

By specifying as follows, we can print this information to standard output.

belonging_index = pdc.belonging_index
belonging_X = pdc.belonging_X
belonging_probability = pdc.belonging_probability

for key, value in pdc.phase_id_dict.items():
   print("belonging_index_{}:".format(key), belonging_index[value])
   print("belonging_X_{}:".format(key), belonging_X[value])
   print("belonging_probability_{}:".format(key), belonging_probability[value])

The following result is output. It shows that the point with the highest belonging probability for label 0 is [0.0, 0.0, 0.0].

belonging_index_0: [0]
belonging_X_0: [[0.0, 0.0, 0.0]]
belonging_probability_0: [0.5]
belonging_index_1: [1]
belonging_X_1: [[0.0, 1.0, 0.0]]
belonging_probability_1: [1.0]
belonging_index_2: [5]
belonging_X_2: [[0.0, 1.0, 1.0]]
belonging_probability_2: [1.0]
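The result above can be cross-checked by hand. The sketch below is an illustration rather than AIPHAD's internal code: it takes the per-point belonging probabilities that this example prints in the step on outputting all uncertainty scores and probabilities, and picks, for each phase, the unidentified point with the largest probability. The variable names are assumptions made for this sketch.

```python
import numpy as np

# Values copied from the example output in this section.
unlabeled_index = np.array([0, 1, 5, 6])
probabilities = np.array([
    [5.00000000e-01, 2.70727708e-35, 5.00000000e-01],
    [2.70727708e-35, 1.00000000e+00, 6.31697986e-35],
    [7.60080658e-70, 2.70727708e-35, 1.00000000e+00],
    [3.33333333e-01, 6.01617129e-36, 6.66666667e-01],
])

# For each phase (column), find the unidentified point with the largest probability.
best_per_phase = {}
for phase in range(probabilities.shape[1]):
    row = int(np.argmax(probabilities[:, phase]))
    best_per_phase[phase] = (int(unlabeled_index[row]), float(probabilities[row, phase]))

for phase, (index, probability) in best_per_phase.items():
    print("phase", phase, "index", index, "probability", probability)
```

For these values the selected indices are 0, 1, and 5, in agreement with the output shown above.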
  7. Output all uncertainty scores and probabilities

We can obtain the uncertainty scores and belonging probabilities of all points. Using this information, we can implement our own uncertainty score.

  • pdc.unlabeled_index_list : Indexes of points where the phase is not identified are stored.

  • pdc.u_score_list : Uncertainty scores of points where the phase is not identified are stored.

  • pdc.label_distributions : Belonging probabilities of points where the phase is not identified are stored. They are output in order of phase index.

By specifying as follows, we can print this information to standard output.

unlabeled_index = pdc.unlabeled_index_list
unlabeled_us = pdc.u_score_list
unlabeled_probabilities = pdc.label_distributions

for i in range(len(unlabeled_index)):
    print("unlabeled_index :", unlabeled_index[i])
    print("unlabeled_us_score :", unlabeled_us[i])
    print("unlabeled_probabilities :", unlabeled_probabilities[i])

The following result is output.

unlabeled_index : 0
unlabeled_us_score : 0.5
unlabeled_probabilities : [5.00000000e-01 2.70727708e-35 5.00000000e-01]
unlabeled_index : 1
unlabeled_us_score : 0.0
unlabeled_probabilities : [2.70727708e-35 1.00000000e+00 6.31697986e-35]
unlabeled_index : 5
unlabeled_us_score : 0.0
unlabeled_probabilities : [7.60080658e-70 2.70727708e-35 1.00000000e+00]
unlabeled_index : 6
unlabeled_us_score : 0.33333333333333337
unlabeled_probabilities : [3.33333333e-01 6.01617129e-36 6.66666667e-01]
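As a starting point for a custom score, the sketch below computes three commonly used uncertainty measures from the probabilities printed above. The function names are assumptions made for this sketch, and the formulas are the standard textbook definitions of least confident, margin, and entropy scores; whether AIPHAD's “MS” and “EA” options use exactly these forms is not confirmed here. The least confident score, 1 minus the largest probability, does reproduce the unlabeled_us_score values in this example.

```python
import numpy as np

# Belonging probabilities copied from the example output above.
probabilities = np.array([
    [5.00000000e-01, 2.70727708e-35, 5.00000000e-01],
    [2.70727708e-35, 1.00000000e+00, 6.31697986e-35],
    [7.60080658e-70, 2.70727708e-35, 1.00000000e+00],
    [3.33333333e-01, 6.01617129e-36, 6.66666667e-01],
])

def least_confident(p):
    """1 minus the largest class probability of each point."""
    return 1.0 - np.max(p, axis=1)

def margin(p):
    """1 minus the gap between the two largest class probabilities."""
    top2 = np.sort(p, axis=1)[:, -2:]
    return 1.0 - (top2[:, 1] - top2[:, 0])

def entropy(p):
    """Shannon entropy of each point's class distribution."""
    safe = np.clip(p, 1e-300, 1.0)  # avoid log(0)
    return -np.sum(p * np.log(safe), axis=1)

print("least confident:", least_confident(probabilities))
print("margin:", margin(probabilities))
print("entropy:", entropy(probabilities))
```

Any function that maps the probability array to one value per point can be substituted in the same way.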
  8. Tune hyperparameters

The phase diagram estimation methods (“LP” and “LS”) have some hyperparameters. These are given default values, so we can obtain suggestions without adjusting them. However, adjusting them may yield better suggestions.

In “LP” and “LS”, the \(\gamma\) value of the RBF kernel can be modified. The default value of \(\gamma\) is 20. To run the calculation with \(\gamma\) set to 10, specify it as follows. Since \(\gamma\) controls the range of influence of a single training point, decreasing \(\gamma\) extends that influence to more distant points, while increasing it restricts the influence to nearby points.

pdc  = pdc_sampler(estimation = "LP", gamma = 10, sampling = "LC", proposal = 1)
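To get a feel for the effect of \(\gamma\), the sketch below evaluates an RBF kernel weight at several distances. It assumes the usual form \(\exp(-\gamma d^2)\), where \(d\) is the distance between two points; that AIPHAD uses exactly this form internally is an assumption, and `rbf_weight` is a name introduced only for this sketch.

```python
import numpy as np

def rbf_weight(distance, gamma):
    """RBF kernel weight exp(-gamma * distance**2) (assumed kernel form)."""
    return np.exp(-gamma * np.asarray(distance, dtype=float) ** 2)

distances = np.array([0.1, 0.3, 0.5, 1.0])
for gamma in (10, 20):
    # A smaller gamma leaves more weight at larger distances.
    print("gamma =", gamma, "weights =", rbf_weight(distances, gamma))
```

At every nonzero distance the weight for \(\gamma\) = 10 is larger than for \(\gamma\) = 20, which is the sense in which a smaller \(\gamma\) lets each labeled point influence points farther away.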

In addition, “LS” has a hyperparameter \(\alpha\), which determines the probability that the label of a labeled point is changed. It must satisfy 0 < \(\alpha\) < 1. To run the calculation with \(\alpha\) set to 0.8, specify it as follows. In “LS”, both \(\alpha\) and \(\gamma\) can be specified.

pdc  = pdc_sampler(estimation = "LS", alpha = 0.8, sampling = "LC", proposal = 1)

Specifying the name of each phase

We can specify a name for each phase. Let FCC be the name of the phase with index 0, HCP the name of the phase with index 1, and BCC the name of the phase with index 2. Specify them as follows.

pdc  = pdc_sampler(estimation = "LP",
                   sampling = "LC",
                   proposal = 1,
                   phase_id_option = {"FCC":0, "HCP":1, "BCC":2}
                   )

If we use this option and run steps 2-4 and 6 above, the following results will be output. We can see that each phase is given a name.

belonging_index_FCC: [0]
belonging_X_FCC: [[0.0, 0.0, 0.0]]
belonging_probability_FCC: [0.5]
belonging_index_HCP: [1]
belonging_X_HCP: [[0.0, 1.0, 0.0]]
belonging_probability_HCP: [1.0]
belonging_index_BCC: [5]
belonging_X_BCC: [[0.0, 1.0, 1.0]]
belonging_probability_BCC: [1.0]
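When experimental records store phase names rather than indices, the same mapping can be used to build y. The sketch below is a plain Python and numpy illustration; `observed`, `y_named`, and `index_to_name` are names assumed for this example and are not AIPHAD attributes.

```python
import numpy as np

# Same mapping as passed to phase_id_option above.
phase_id_option = {"FCC": 0, "HCP": 1, "BCC": 2}

# Illustrative record of observations: None marks a point not yet identified.
observed = [None, None, "FCC", "HCP", "BCC", None, None, "BCC"]

# Convert names to phase indices, using -1 for unidentified points.
y_named = np.array([-1 if name is None else phase_id_option[name] for name in observed])

# Reverse mapping, useful for turning indices back into names when reporting.
index_to_name = {index: name for name, index in phase_id_option.items()}

print(y_named)
print(index_to_name)
```

The resulting array is identical to the y used earlier in this example.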

Proposal when one experimental condition is fixed

Fixing one experimental condition can make the next experiment easier to perform. For example, it may be easier to keep the temperature fixed and change only the pressure. With this option, we receive a proposal in which one experimental condition is kept fixed relative to the previous experiment. This option is only available when proposal = 1.

To use this option, we write the code as follows.

pdc  = pdc_sampler(estimation = "LP",
                   sampling = "LC",
                   proposal = 1,
                   parameter_constraint = True,
                   prev_point = [1.00, 1.00, 1.00]
                   )
  • prev_point : The previous experimental condition is specified.

  • parameter_constraint : By setting it to True, one of the previous experimental conditions is fixed.

If we specify this option and execute the calculation according to 2-5 above, we will get the following results. The next candidate point is [1.0, 0.0, 1.0]. It can be confirmed that the proposed experimental condition differs from the one proposed without this option, [0.0, 0.0, 0.0].

proposal_index: [6]
proposal_X: [[1.0, 0.0, 1.0]]
proposal_us: [0.33333333333333337]
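One plausible reading of this constraint is sketched below: keep only the unidentified points that share at least one coordinate with prev_point, then take the one with the highest uncertainty score. This filtering rule is an assumption made for illustration, and AIPHAD's actual rule may differ; the scores are the ones printed earlier in this example.

```python
import numpy as np

X = np.array([
    [0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0], [0.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0],
])
unlabeled_index = np.array([0, 1, 5, 6])
scores = np.array([0.5, 0.0, 0.0, 0.33333333333333337])
prev_point = np.array([1.0, 1.0, 1.0])

# Assumed rule: a candidate is allowed if any coordinate equals that of prev_point.
shares_condition = np.any(np.isclose(X[unlabeled_index], prev_point), axis=1)
allowed = np.where(shares_condition)[0]

# Among the allowed candidates, pick the one with the highest uncertainty score.
best = allowed[np.argmax(scores[allowed])]
constrained_index = int(unlabeled_index[best])

print("proposal_index:", [constrained_index])
print("proposal_X:", X[constrained_index].tolist())
```

Under this assumed rule, point 0 is excluded because it shares no coordinate with prev_point, and index 6 is selected, which agrees with the output above.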

Proposal of multiple candidate points

Multiple experimental conditions can be proposed.

To use this option, the code is written as follows. It can be calculated by specifying the number of proposals in proposal.

pdc  = pdc_sampler(estimation = "LP",
                   sampling = "LC",
                   proposal = 2,
                   multi_method = "OU")
  • multi_method : Select an algorithm that makes multiple proposals. “OU”: Proposals are obtained in descending order of uncertainty score. “NE”: Candidate points with large uncertainty scores and far apart are proposed. If nothing is specified, “OU” is selected by default.

  • NE_k : It should be specified as an integer value when “NE” is selected. Candidate points closer than the NE_k nearest from the proposed point will not be proposed. The default is 1. In this case, multiple closest candidate points will not be proposed.

If we specify this option and execute the calculation according to 2-5 above, the following results will be output. The proposed experimental points are [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]].

proposal_index: [0, 6]
proposal_X: [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
proposal_us: [0.5, 0.33333333333333337]
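The “OU” behavior described above, taking proposals in descending order of uncertainty score, can be sketched with numpy as follows. This is an illustration using the scores printed earlier in this example, not AIPHAD's own code, and the variable names are assumptions.

```python
import numpy as np

# Indices and uncertainty scores of unidentified points, copied from this example.
unlabeled_index = np.array([0, 1, 5, 6])
scores = np.array([0.5, 0.0, 0.0, 0.33333333333333337])
n_proposals = 2

# Sort by descending score; a stable sort keeps the original order among ties.
order = np.argsort(-scores, kind="stable")[:n_proposals]
proposal_index = unlabeled_index[order].tolist()
proposal_scores = scores[order].tolist()

print("proposal_index:", proposal_index)
print("proposal_us:", proposal_scores)
```

The “NE” option additionally takes distances between candidates into account, which this sketch does not attempt to reproduce.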

In addition, multiple candidate points with high probability of belonging to each phase can be proposed.

The following results are obtained by executing the calculation according to 2-6 above.

belonging_index_0: [0, 6]
belonging_X_0: [[0.0, 0.0, 0.0], [1.0, 0.0, 1.0]]
belonging_probability_0: [0.5, 0.3333333333333333]
belonging_index_1: [1, 5]
belonging_X_1: [[0.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
belonging_probability_1: [1.0, 2.707277081768123e-35]
belonging_index_2: [5, 6]
belonging_X_2: [[0.0, 1.0, 1.0], [1.0, 0.0, 1.0]]
belonging_probability_2: [1.0, 0.6666666666666666]