
kinetics estimation

Matt Johnson edited this page Feb 24, 2017 · 26 revisions

Non-Pressure Dependent Kinetics Estimation:

Which reactions are considered and which rate parameters are generated are governed by the libraries, seed mechanisms, and families set in the input file. When RMG starts, all reactions from any seed mechanisms, along with their associated rate information, are included and used within the model. As RMG runs, whenever a new reaction is generated for a given family, any available libraries are searched first for rate parameter data. If rate data is found it is used; otherwise RMG estimates the rate using the families directly.
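The lookup order described above can be sketched in a few lines. This is only an illustration, not RMG's actual code: `lookup_rate`, `toy_family_estimate`, the library name, the reaction string, and the numeric values are all hypothetical placeholders.

```python
def lookup_rate(reaction, libraries, estimate_from_family):
    """Return (rate, source), preferring library data over family estimates.

    `libraries` is an ordered list of (name, dict) pairs; both the structure
    and the names are assumptions made for this sketch.
    """
    for library_name, library in libraries:
        if reaction in library:
            # Library data wins as soon as it is found.
            return library[reaction], library_name
    # No library had the reaction, so fall back to the family estimate.
    return estimate_from_family(reaction), "family estimate"


def toy_family_estimate(reaction):
    """Stand-in for the family-based estimator (placeholder value)."""
    return 1.0e3


# Placeholder library content; the number is not real kinetic data.
libraries = [("my_library", {"A + B <=> C + D": 1.0e6})]

print(lookup_rate("A + B <=> C + D", libraries, toy_family_estimate))
print(lookup_rate("E <=> F", libraries, toy_family_estimate))
```

The first call is served from the library; the second falls through to the family estimate.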

##Reverse Direction
Rates in the reverse direction are calculated from the forward rate (as defined by the family) and the equilibrium constant (as calculated from thermodynamic parameters). For the purposes of this discussion we consider only the forward rate, since the reverse rate is derived from it.
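As a rough sketch of that relationship, the snippet below uses the standard textbook relations k_rev = k_fwd / K_c and K_c = exp(-ΔG°/RT) · (P°/RT)^Δn. The function names, the 1 bar standard pressure, and the choice of SI units are assumptions of this example and are not taken from the RMG source.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
P0 = 1.0e5        # assumed standard-state pressure, Pa (1 bar)


def equilibrium_constant_c(delta_g, temperature, delta_n):
    """Concentration-based equilibrium constant (mol/m^3 units).

    delta_g: standard Gibbs energy of reaction in J/mol
    delta_n: moles of products minus moles of reactants
    """
    k_a = math.exp(-delta_g / (R * temperature))
    return k_a * (P0 / (R * temperature)) ** delta_n


def reverse_rate(k_forward, delta_g, temperature, delta_n=0):
    """Reverse rate coefficient from the forward rate and equilibrium constant."""
    return k_forward / equilibrium_constant_c(delta_g, temperature, delta_n)


# With delta_g = 0 and no change in mole number, forward and reverse are equal.
print(reverse_rate(2.0, 0.0, 1000.0))
```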

Family based rate estimation:

Family structure:

Families are composed of a tree or a series of trees whose nodes are groups. Each tree describes an independent segment of the reaction, usually one of the reactants or reaction-related sites, although in practice the segments can be based on any independent properties. Every group is connected to at most one parent group and to any number of child groups. All child groups must be

Example 1: X-H + Y. <=> Y-H + X.

This one has one tree for X-H and one tree for Y.

Example 2: H-R2~R1. <=> .R2~R1-H

This one has three trees: one for the backbone (~) between endgroups R1 and R2, one for R1, and one for R2. The backbone is usually defined as all atoms between the endgroups that do not directly participate in the reaction.

###Groups:
Groups are more general versions of molecules. Unlike molecules, groups permit extra types of nodes within the graph. For example, a node in a group could be any of R, R!H, C, Cs, Cd, Cdd, Ct, CS, CO, or Cb, all of which could represent a carbon atom. Additionally, different bond structure possibilities are allowed.

For example:

| Atom-Number | Label | GroupAtom | Unpaired-Electrons | Bonds |
|---|---|---|---|---|
| 1 | *1 | R!H | u1 | {2,T} |
| 2 | *4 | Ct | u0 | {1,T} {3,S} |
| 3 | *6 | [Cd,Ct,Cb] | u0 | {2,S} {4,[D,T,B]} |
| 4 | *5 | [Cd,Ct,Cb] | u0 | {3,[D,T,B]} {5,S} |
| 5 | *2 | [Cd,Cdd,Ct,CO,N,CS] | u0 | {4,S} {6,[D,T]} |
| 6 | *3 | [Cd,Ct,Od,Sd,Cdd,N] | u0 | {5,[D,T]} |

This is a backbone group from Intra_R_Add_Endocyclic. The labels define the endgroups and the backbone structure (although in general not all atoms need to be, or should be, labeled). Brackets enclose all possibilities for a given GroupAtom or bond. R!H will match anything except a hydrogen. The GroupAtoms allow flexible definition of the atoms. Note that a molecule matching a given group usually contains more atoms and bonds than the group itself; any group or molecule that meets the requirements of the above structure will match the group.
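To make the bracket semantics concrete, here is a minimal sketch of how a single GroupAtom or bond entry could be tested against one atom type or bond order. The hierarchy below is a deliberately simplified assumption that uses only type names appearing on this page; RMG's real atom-type hierarchy is larger, and `atom_matches` and `bond_matches` are hypothetical helper names.

```python
# Simplified, assumed hierarchy: general types map to the specific types they accept.
CARBON_TYPES = {"Cs", "Cd", "Cdd", "Ct", "CO", "CS", "Cb"}
ALL_TYPES = CARBON_TYPES | {"H", "Od", "Sd", "N"}

GENERAL = {
    "R": ALL_TYPES,            # matches anything
    "R!H": ALL_TYPES - {"H"},  # matches anything except hydrogen
    "C": CARBON_TYPES,         # matches any carbon type
}


def atom_matches(options, atom_type):
    """True if any bracketed GroupAtom option accepts the given atom type."""
    return any(atom_type in GENERAL.get(option, {option}) for option in options)


def bond_matches(options, bond_order):
    """True if the bond order (S, D, T, or B) is one of the allowed options."""
    return bond_order in options


# Atom 3 of the group above: [Cd,Ct,Cb] with a [D,T,B] bond to atom 4.
print(atom_matches(["Cd", "Ct", "Cb"], "Ct"), bond_matches(["D", "T", "B"], "T"))
# Atom 1 of the group above: R!H rejects hydrogen.
print(atom_matches(["R!H"], "H"))
```

A full match would apply checks like these to every atom and bond while searching for a subgraph isomorphism, which this sketch does not attempt.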

###Descending a tree:
When a given reaction is examined, every tree in the family is descended. At each level the appropriate molecules are compared with each child at that level until a subgraph-isomorphic match is found that also matches the labels appropriately. The descent then continues to that match's children, and so on, until there are no children or the reaction matches none of the children. This places the reaction at a set of tree coordinates such as [R3;doublebond_intra_pri_2H;radadd_intra_cddouble].
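The descent can be sketched as a simple loop. In this toy version the subgraph-isomorphism test is replaced by a set-inclusion check on made-up feature tags, and each tree is cut down to a parent and one child, so `Node`, `descend`, and the `tag_*` strings are all assumptions for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    required: frozenset  # stand-in for the node's group definition
    children: list = field(default_factory=list)


def descend(root, features):
    """Walk down from root, taking the first matching child at each level."""
    node = root
    while True:
        for child in node.children:
            if child.required <= features:  # placeholder for subgraph matching
                node = child
                break
        else:
            return node  # no children, or none of them match


# Toy trees: one parent and one child each (real trees are much deeper).
trees = [
    Node("Rn", frozenset(), [Node("R3", frozenset({"tag_R3"}))]),
    Node("doublebond_intra_pri", frozenset(),
         [Node("doublebond_intra_pri_2H", frozenset({"tag_2H"}))]),
    Node("radadd_intra", frozenset(),
         [Node("radadd_intra_cddouble", frozenset({"tag_cddouble"}))]),
]

features = {"tag_R3", "tag_2H", "tag_cddouble"}
coordinates = [descend(tree, features).name for tree in trees]
print(coordinates)
```

Taking the first matching child mirrors the description above; it also means the order of siblings matters whenever more than one child could match.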

###Family Rate Estimation Process:
When the families are loaded, the rate rules (each associated with one group from each tree in the family) are loaded first and placed at the corresponding locations in the tree. Then training reactions are added by descending the trees and placing the rate data from each reaction at the associated combination of node coordinates. At this point averaging is done based on the distance between nodal coordinates. This distance is defined as the sum of the distances between nodes for each tree.

For example:

[R3;doublebond_intra_pri_2H;radadd_intra_cddouble] is a distance of 1 away from [Rn;doublebond_intra_pri_2H;radadd_intra_cddouble] because R3 is one node below Rn, while

[R3;doublebond_intra_pri_2H;radadd_intra_cddouble] is a distance of 2 away from [R3;doublebond_intra_pri;radadd_intra] because doublebond_intra_pri_2H is one node away from doublebond_intra_pri and radadd_intra_cddouble is one node away from radadd_intra.
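The two distances above can be reproduced with a short sketch. It assumes each tree is stored as a child-to-parent map and that the distance within one tree is the number of edges on the path between the two nodes; the maps below contain only the nodes named in the examples, and the function names are hypothetical.

```python
# One child -> parent map per tree (None marks the top of each toy tree).
PARENTS = [
    {"Rn": None, "R3": "Rn"},
    {"doublebond_intra_pri": None,
     "doublebond_intra_pri_2H": "doublebond_intra_pri"},
    {"radadd_intra": None, "radadd_intra_cddouble": "radadd_intra"},
]


def path_to_root(node, parent):
    """List of nodes from `node` up to the top of its tree."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_distance(a, b, parent):
    """Number of edges between a and b through their closest common ancestor."""
    up_a = path_to_root(a, parent)
    for steps_b, node in enumerate(path_to_root(b, parent)):
        if node in up_a:
            return steps_b + up_a.index(node)
    raise ValueError("nodes do not share a tree")


def coordinate_distance(coords_1, coords_2):
    """Sum of the per-tree distances between two sets of nodal coordinates."""
    return sum(tree_distance(a, b, p)
               for a, b, p in zip(coords_1, coords_2, PARENTS))


start = ("R3", "doublebond_intra_pri_2H", "radadd_intra_cddouble")
print(coordinate_distance(start, ("Rn", "doublebond_intra_pri_2H", "radadd_intra_cddouble")))
print(coordinate_distance(start, ("R3", "doublebond_intra_pri", "radadd_intra")))
```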

Using these distances, each node in the tree is assigned the data present at the closest node that has a rule or training reaction. If two or more nodes with original rate data are equally close, the geometric average of their data is placed at that node. In this way rate information is placed at every node in the tree. When rate data is requested for a reaction from a given family, RMG descends all of the trees and takes the rate information present at the reaction's final nodal coordinates.
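A compact sketch of this fill-in step is given below. It makes several simplifying assumptions: rate data is reduced to a single number per node (real rate data would be a full set of kinetic parameters), the two toy trees and the node names `X`, `X1`, and `X2` are invented, the numeric values are placeholders, and `fill_tree` is a hypothetical name.

```python
import itertools
import statistics

# Two toy trees as child -> parent maps (X, X1, X2 are made-up node names).
PARENTS = [
    {"Rn": None, "R3": "Rn"},
    {"X": None, "X1": "X", "X2": "X"},
]


def path_to_root(node, parent):
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def tree_distance(a, b, parent):
    up_a = path_to_root(a, parent)
    for steps_b, node in enumerate(path_to_root(b, parent)):
        if node in up_a:
            return steps_b + up_a.index(node)
    raise ValueError("nodes do not share a tree")


def distance(coords_1, coords_2):
    return sum(tree_distance(a, b, p)
               for a, b, p in zip(coords_1, coords_2, PARENTS))


def fill_tree(data):
    """Give every coordinate the data of its nearest source node(s)."""
    filled = {}
    for coords in itertools.product(*PARENTS):
        dists = {source: distance(coords, source) for source in data}
        nearest = min(dists.values())
        ties = [data[source] for source, d in dists.items() if d == nearest]
        # A single nearest source is copied; ties are geometrically averaged.
        filled[coords] = ties[0] if len(ties) == 1 else statistics.geometric_mean(ties)
    return filled


# Placeholder rate values at two nodes that carry original data.
data = {("R3", "X1"): 1.0, ("R3", "X2"): 100.0}
filled = fill_tree(data)
print(filled[("R3", "X")])    # equidistant from both sources
print(filled[("Rn", "X1")])   # closest to ("R3", "X1")
```

Here `("R3", "X")` sits one step from both sources and so receives their geometric mean, while `("Rn", "X1")` is closer to `("R3", "X1")` and simply inherits its value.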

##Degeneracy

  • group/rate rules/libraries/ etc. (may already be in main documentation)
  • degeneracy
  • database usage
  • small code to generate reactions