Builder¶
System assembly: polymer chain construction from CGSmiles topology and monomer libraries.
Quick reference¶
| Symbol | Summary | Preferred for |
|---|---|---|
| PolymerBuilder | Build chains from CGSmiles + library + connector + placer | Full control over assembly |
| polymer(cgsmiles, ...) | CGSmiles → chain in one call | Quick prototyping |
| Connector | Port selection rules + reaction binding | Defining which ports react |
| Placer | Geometric placement (separator + orienter) | Controlling inter-monomer geometry |
| CovalentSeparator | Covalent radii-based distance | Default monomer spacing |
| LinearOrienter | Linear chain orientation | Default growth direction |
Canonical example¶
from molpy.builder.polymer import (
PolymerBuilder, Connector, Placer,
CovalentSeparator, LinearOrienter,
)
from molpy.builder import polymer
builder = PolymerBuilder(
library={"EO": eo_template},
connector=Connector(port_map={("EO","EO"): (">","<")}, reacter=rxn),
placer=Placer(separator=CovalentSeparator(buffer=-0.1),
orienter=LinearOrienter()),
)
result = builder.build("{[#EO]|10}")
chain = result.polymer
# Or use the one-call entry function. Per the polymer() signature in the Full API
# section, it takes reaction_preset (not reacter) and returns the chain directly:
chain = polymer("{[#EO]|10}", library={"EO": eo_template})
Full API¶
Crystal¶
crystal ¶
Crystal lattice builder.
Tile a Bravais lattice over a range of unit cells and (optionally) clip the result to a geometric molpy.core.region.Region.
Example
from molpy.builder import Lattice, build_crystal
from molpy.core.region import BoxRegion
lat = Lattice.fcc(a=3.52, species="Ni")
structure = build_crystal(lat, repeats=(4, 4, 4))
# or clip a 30 Å cube out of a larger tile:
structure = build_crystal(lat, BoxRegion(lengths=[30, 30, 30]))
Lattice ¶
Bravais lattice = cell matrix + list of basis Site objects.
The cell matrix stores the three lattice vectors as rows:
    cell = [[a1x, a1y, a1z],
            [a2x, a2y, a2z],
            [a3x, a3y, a3z]]
Construct directly with a matrix, or use from_vectors / sc / bcc / fcc / rocksalt.
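As a rough illustration of the row convention, a fractional coordinate (f1, f2, f3) maps to the Cartesian point f1*a1 + f2*a2 + f3*a3. The sketch below uses plain Python tuples rather than the molpy API; `frac_to_cart` and `tile` are hypothetical helper names, and the four-site basis is the conventional fcc basis, assumed here for illustration.

```python
from itertools import product


def frac_to_cart(cell, frac):
    """Map fractional coordinates to Cartesian; lattice vectors are the rows of cell."""
    return tuple(sum(frac[i] * cell[i][k] for i in range(3)) for k in range(3))


def tile(cell, basis, repeats):
    """Replicate fractional basis sites over repeats[0] x repeats[1] x repeats[2] cells."""
    positions = []
    for i, j, k in product(*(range(n) for n in repeats)):
        for f in basis:
            shifted = (f[0] + i, f[1] + j, f[2] + k)
            positions.append(frac_to_cart(cell, shifted))
    return positions


a = 3.52  # cubic lattice constant in Å (the value used in the Ni example)
cell = [[a, 0.0, 0.0], [0.0, a, 0.0], [0.0, 0.0, a]]
# Conventional fcc basis in fractional coordinates (assumption for this sketch)
fcc_basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]

positions = tile(cell, fcc_basis, repeats=(4, 4, 4))
print(len(positions))  # 4 basis sites x 64 cells = 256
```

Clipping to a region would then amount to filtering `positions` with a point-in-region test.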
Site
dataclass
¶
Lattice basis site in fractional coordinates.
Attributes:
| Name | Type | Description |
|---|---|---|
| label | str | Site identifier (e.g. |
| species | str | Chemical species or type name (e.g. |
| fractional | tuple[float, float, float] | Fractional coordinates |
| charge | float | Partial charge (default |
| attrs | dict[str, Any] \| None | Optional auxiliary attributes. |
build_crystal ¶
Tile lattice and (optionally) clip to a Cartesian region.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
lattice
|
Lattice
|
Bravais lattice with basis sites. |
required |
region
|
Region | None
|
Geometric region in Cartesian space (e.g.
:class: |
None
|
repeats
|
tuple[int, int, int] | None
|
Number of unit cells along each lattice vector,
|
None
|
Returns:
| Type | Description |
|---|---|
Atomistic
|
class: |
Atomistic
|
full tiled super-cell ( |
Polymer¶
polymer ¶
Polymer assembly module.
Provides linear polymer assembly with both topology-only and chemical reaction connectors, plus optional geometric placement via Placer strategies.
BuildPolymer
dataclass
¶
Bases: Tool
Build a polymer chain from CGSmiles notation and a monomer library.
Preferred for
- Assembling a single chain from pre-prepared monomers.
- Iterating over a system plan to build chains one at a time.
Avoid when
- You want end-to-end build from a string (use polymer() or BuildSystem).
- You need custom reaction logic (use PolymerBuilder directly).
Attributes:
| Name | Type | Description |
|---|---|---|
reaction_preset |
str
|
Name of reaction preset (default |
use_placer |
bool
|
Enable geometric placement of monomers. |
run ¶
Build a polymer chain.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cgsmiles
|
str
|
CGSmiles notation (e.g. |
required |
library
|
dict[str, Atomistic]
|
Mapping from label to prepared Atomistic monomer. |
required |
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with |
dict[str, Any]
|
and |
BuildPolymerAmber
dataclass
¶
BuildPolymerAmber(reaction_preset='dehydration', force_field='gaff2', charge_method='bcc', conda_env=None, work_dir='amber_work')
Bases: Tool
Build a polymer chain using the AmberTools backend.
Uses antechamber, parmchk2, prepgen, and tleap to assemble a polymer from a CGSmiles string and a monomer library. Returns both MolPy structures and AMBER topology/coordinate files.
Preferred for
- Polymer systems that need AMBER force field parameters (GAFF/GAFF2).
- Workflows that feed into AMBER or LAMMPS with AMBER-style inputs.
Avoid when
- You do not need force field parameters (use BuildPolymer).
- AmberTools is not installed.
Attributes:
| Name | Type | Description |
|---|---|---|
reaction_preset |
str | None
|
Named preset for leaving group detection. When None, hydrogen atoms bonded to port atoms are auto-detected. |
force_field |
str
|
Amber force field ( |
charge_method |
str
|
Antechamber charge method. |
conda_env |
str | None
|
Conda environment containing AmberTools. |
work_dir |
str
|
Directory for intermediate files. |
run ¶
Build a polymer using AmberTools.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cgsmiles
|
str
|
CGSmiles notation (e.g. |
required |
library
|
dict[str, Atomistic]
|
Mapping from label to prepared Atomistic monomer.
Each monomer must have |
required |
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with |
dict[str, Any]
|
|
BuildSystem
dataclass
¶
Bases: Tool
End-to-end polymer system construction from G-BigSMILES.
Parses a G-BigSMILES string and delegates to the GBigSmilesCompiler to produce a list of Atomistic chains.
Preferred for
- Building a complete polydisperse system in one call.
- When you do not need to inspect the system plan before building.
Avoid when
- You need to inspect or modify the plan first (use PlanSystem + BuildPolymer).
- You need the Amber backend (use BuildPolymerAmber).
Attributes:
| Name | Type | Description |
|---|---|---|
reaction_preset |
str
|
Name of reaction preset. |
add_hydrogens |
bool
|
Add explicit hydrogens during monomer preparation. |
optimize |
bool
|
Optimize monomer geometry. |
random_seed |
int | None
|
Random seed for reproducibility. |
Chain
dataclass
¶
Represents a single polymer chain.
Attributes:
| Name | Type | Description |
|---|---|---|
dp |
int
|
Degree of polymerization (number of monomers) |
monomers |
list[str]
|
List of monomer identifiers in the chain |
mass |
float
|
Total mass of the chain (g/mol) |
Connector ¶
Select ports and execute reactions between adjacent monomers.
Port selection strategy (applied in order):
1. Explicit port_map lookup for (left_label, right_label)
2. Compatibility: > on left pairs with < on right
3. Single-port: each side has exactly one unconsumed port
4. Common name: both sides share a port name (for $ ports)
5. Raise AmbiguousPortsError
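The ordered rules above can be sketched as a simple cascade. This is an illustrative stand-in, not the library's implementation: `select_port_names` is a hypothetical function, ports are modeled as a plain mapping from port name to a list of placeholder atoms, and the local `AmbiguousPortsError` class only mimics the error named in rule 5.

```python
class AmbiguousPortsError(Exception):
    """Local stand-in for the error named in rule 5."""


def select_port_names(left_label, right_label, left_ports, right_ports, port_map=None):
    """Return (left_port_name, right_port_name) by applying the rules in order."""
    port_map = port_map or {}
    # Only names that still have at least one unconsumed atom count as available.
    left_names = [name for name, atoms in left_ports.items() if atoms]
    right_names = [name for name, atoms in right_ports.items() if atoms]

    # 1. Explicit port_map lookup for (left_label, right_label)
    if (left_label, right_label) in port_map:
        return tuple(port_map[(left_label, right_label)])
    # 2. Compatibility: ">" on the left pairs with "<" on the right
    if ">" in left_names and "<" in right_names:
        return (">", "<")
    # 3. Single-port: each side has exactly one available port name
    if len(left_names) == 1 and len(right_names) == 1:
        return (left_names[0], right_names[0])
    # 4. Common name: both sides share exactly one port name (e.g. "$")
    common = sorted(set(left_names) & set(right_names))
    if len(common) == 1:
        return (common[0], common[0])
    # 5. Nothing matched unambiguously
    raise AmbiguousPortsError(
        f"cannot choose ports for {left_label!r} -> {right_label!r}"
    )


print(select_port_names("EO", "EO", {">": ["C1"], "<": ["C2"]}, {">": ["C3"], "<": ["C4"]}))
```

Putting the explicit map first means a user-supplied pairing always overrides the heuristics.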
connect ¶
Execute the chemical reaction between two structures.
select_ports ¶
Select which ports to connect.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left | Atomistic | Left Atomistic structure. | required |
| right | Atomistic | Right Atomistic structure. | required |
| left_ports | Mapping[str, list[Atom]] | Available ports on left (name -> list[Atom]). | required |
| right_ports | Mapping[str, list[Atom]] | Available ports on right (name -> list[Atom]). | required |
| ctx | ConnectorContext | Context with step info and labels. | required |
Returns:
| Type | Description |
|---|---|
| tuple[str, int, str, int, None] | (left_port_name, left_idx, right_port_name, right_idx, None) |
ConnectorContext ¶
Bases: dict[str, Any]
Shared context passed to the connector during linear build.
Keys:
- step: int (current connection step index)
- left_label: str (label of left monomer)
- right_label: str (label of right monomer)
- sequence: list[str] (full sequence being built)
CovalentSeparator ¶
Separator based on typical bond lengths (for bonded atoms).
Uses realistic bond lengths based on element types. Typical bond lengths:
- C-C: 1.54 Å (single), 1.34 Å (double)
- C-O: 1.43 Å (single), 1.23 Å (double)
- C-N: 1.47 Å (single)
- O-H: 0.96 Å
- N-H: 1.01 Å
Initialize covalent separator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| buffer | float | Additional buffer distance in Angstroms. Can be negative to account for slight compression. | 0.0 |
get_separation ¶
Calculate separation based on typical bond lengths.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| left_struct | Atomistic | Previous structure in sequence | required |
| right_struct | Atomistic | Next structure to place | required |
| left_port | Atom | Connection port on left structure | required |
| right_port | Atom | Connection port on right structure | required |
Returns:
| Type | Description |
|---|---|
| float | Separation distance = typical_bond_length + buffer |
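The rule "separation = typical_bond_length + buffer" can be illustrated with a small lookup table. This is a minimal sketch, not the library code: `covalent_separation` is a hypothetical function working on element symbols, the table holds only the single-bond values quoted for this class, and the 1.5 Å fallback for unlisted pairs is an assumption.

```python
# Single-bond lengths in Å, taken from the list quoted for CovalentSeparator.
TYPICAL_BOND_LENGTHS = {
    frozenset(("C", "C")): 1.54,
    frozenset(("C", "O")): 1.43,
    frozenset(("C", "N")): 1.47,
    frozenset(("O", "H")): 0.96,
    frozenset(("N", "H")): 1.01,
}


def covalent_separation(left_element, right_element, buffer=0.0, fallback=1.5):
    """Return typical bond length + buffer for the two port elements.

    The fallback value for unlisted element pairs is an assumption of this sketch.
    """
    key = frozenset((left_element, right_element))  # order-independent lookup
    return TYPICAL_BOND_LENGTHS.get(key, fallback) + buffer


# A negative buffer, as in the canonical example, shortens the spacing slightly.
print(round(covalent_separation("C", "C", buffer=-0.1), 2))  # 1.44
```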
DPDistribution ¶
Bases: Protocol
Protocol for distributions that sample degree of polymerization directly.
Distributions implementing this protocol can sample DP values without requiring monomer mass information. This is suitable for distributions defined in DP space (e.g., Poisson, Uniform).
dp_pmf ¶
Probability mass function for DP values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dp_array
|
ndarray
|
Array of DP values |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Array of probability mass values |
sample_dp ¶
Sample degree of polymerization from distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rng
|
Generator
|
NumPy random number generator |
required |
Returns:
| Type | Description |
|---|---|
int
|
Degree of polymerization (>= 1) |
FlorySchulzPolydisperse ¶
Flory-Schulz (geometric) distribution for degree of polymerization.
PMF: P(N = k) = a^2 * k * (1 - a)^(k-1), k = 1, 2, ...
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| a | float | Probability parameter (0 < a < 1), related to extent of reaction. | required |
| random_seed | int \| None | Optional random seed. | None |
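To make the PMF concrete, the sketch below evaluates P(N = k) = a^2 * k * (1 - a)^(k-1) and draws samples from it using only the standard library. It relies on the fact that the sum of two independent geometric variables (each counting trials up to the first success) minus one has exactly this PMF. The function names are hypothetical and do not reflect how the class itself samples.

```python
import math
import random


def flory_schulz_pmf(k, a):
    """P(N = k) = a^2 * k * (1 - a)^(k - 1) for k = 1, 2, ..."""
    return a * a * k * (1.0 - a) ** (k - 1)


def sample_geometric(a, rng):
    """Trials up to and including the first success, by inverse transform."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return max(1, math.ceil(math.log(u) / math.log(1.0 - a)))


def sample_flory_schulz(a, rng):
    """Sum of two geometric draws minus one follows the PMF above."""
    return sample_geometric(a, rng) + sample_geometric(a, rng) - 1


a = 0.1
rng = random.Random(42)
samples = [sample_flory_schulz(a, rng) for _ in range(20000)]
total_probability = sum(flory_schulz_pmf(k, a) for k in range(1, 2000))
print(total_probability)               # close to 1
print(sum(samples) / len(samples))     # close to the analytic mean 2/a - 1
```

The analytic mean of this PMF is 2/a - 1, which gives a quick sanity check on any sampler.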
GrowthKernel ¶
Bases: Protocol
Protocol for local transition function in port-level stochastic growth.
A GrowthKernel decides which monomer (if any) to add next for a given reactive port on the growing polymer. This encapsulates the reaction probability logic from G-BigSMILES notation.
choose_next_for_port ¶
Choose next monomer for a given port.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
polymer
|
Atomistic
|
Current polymer structure |
required |
port
|
Atom
|
Port to extend from |
required |
candidates
|
Sequence[MonomerTemplate]
|
Available monomer templates |
required |
rng
|
Generator | None
|
Random number generator for sampling |
None
|
Returns:
| Name | Type | Description |
|---|---|---|
MonomerPlacement |
MonomerPlacement | None
|
Add this template at target port |
None |
MonomerPlacement | None
|
Terminate this port (implicit end-group) |
LinearOrienter ¶
Orienter for linear polymer arrangement.
Aligns the next monomer so that:
1. The two port atoms are separated by the specified distance
2. The port connection axis of the next monomer aligns with the port connection axis of the previous monomer
3. The monomer extends in a linear fashion
get_orientation ¶
Calculate linear alignment transformation.
Strategy:
1. Get direction vector from left port anchor (outward)
2. Place right structure so its port anchor is at the target position
3. Align right structure's port direction with left port direction
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
left_struct
|
Atomistic
|
Previous structure in sequence |
required |
right_struct
|
Atomistic
|
Next structure to place |
required |
left_port
|
Atom
|
Connection port on left structure |
required |
right_port
|
Atom
|
Connection port on right structure |
required |
separation
|
float
|
Distance between port anchors |
required |
Returns:
| Type | Description |
|---|---|
tuple[ndarray, ndarray]
|
Tuple of (translation_vector, rotation_matrix) |
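The three-step strategy can be sketched with plain vector arithmetic. The code below is an illustration under stated assumptions, not the library's method: `place_linear` and its helpers are hypothetical, coordinates are tuples, and "aligning" is interpreted as rotating the right port's outward direction until it points back along the left port's outward direction. The rotation uses Rodrigues' formula and is undefined when the two directions are exactly antiparallel.

```python
import math


def unit(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])


def rotation_between(a, b):
    """Rotation matrix (list of rows) turning direction a onto direction b."""
    a, b = unit(a), unit(b)
    v, c = cross(a, b), dot(a, b)
    if c < -1.0 + 1e-12:
        raise ValueError("antiparallel directions: rotation axis is not unique")
    k = 1.0 / (1.0 + c)
    vx = [[0.0, -v[2], v[1]], [v[2], 0.0, -v[0]], [-v[1], v[0], 0.0]]
    vx2 = [[sum(vx[i][m] * vx[m][j] for m in range(3)) for j in range(3)] for i in range(3)]
    return [[(1.0 if i == j else 0.0) + vx[i][j] + k * vx2[i][j] for j in range(3)]
            for i in range(3)]


def rotate(R, p):
    return tuple(dot(R[i], p) for i in range(3))


def place_linear(left_port_pos, left_dir, right_port_pos, right_dir, right_coords, separation):
    """Return new coordinates for the right structure."""
    left_dir = unit(left_dir)
    # Steps 1-2: target position of the right port along the left outward direction
    target = tuple(p + separation * d for p, d in zip(left_port_pos, left_dir))
    # Step 3 (assumed convention): right outward direction faces the left port
    R = rotation_between(right_dir, tuple(-d for d in left_dir))
    placed = []
    for r in right_coords:
        rel = tuple(x - p for x, p in zip(r, right_port_pos))
        placed.append(tuple(t + x for t, x in zip(target, rotate(R, rel))))
    return placed


placed = place_linear((0, 0, 0), (1, 0, 0), (0, 0, 0), (0, 1, 0),
                      [(0, 0, 0), (0, -1, 0)], separation=1.5)
print(placed)
```

In the usage line, the right port lands 1.5 Å along +x and the atom behind it ends up further along the same axis, which is the linear extension described above.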
MassDistribution ¶
Bases: Protocol
Protocol for distributions that sample molecular weight directly.
Distributions implementing this protocol sample mass values directly from the distribution without converting through DP. This is suitable for distributions defined in mass space (e.g., Schulz-Zimm).
mass_pdf ¶
Probability density function for mass values.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
mass_array
|
ndarray
|
Array of mass values (g/mol) |
required |
Returns:
| Type | Description |
|---|---|
ndarray
|
Array of probability density values |
sample_mass ¶
Sample molecular weight from distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rng
|
Generator
|
NumPy random number generator |
required |
Returns:
| Type | Description |
|---|---|
float
|
Molecular weight (g/mol, > 0) |
MonomerPlacement
dataclass
¶
Decision for next monomer placement during stochastic growth.
Represents the output of a GrowthKernel's decision: which template to add and which port on that template to connect.
Attributes:
| Name | Type | Description |
|---|---|---|
template |
MonomerTemplate
|
MonomerTemplate to add |
target_descriptor_id |
int
|
Which port descriptor on the new monomer to connect |
Example
placement = MonomerPlacement(
    template=eo_template,
    target_descriptor_id=1,  # Connect via port descriptor 1
)
print(f"Add {placement.template.label} at port {placement.target_descriptor_id}")
MonomerTemplate
dataclass
¶
Template for a monomer with port descriptors and metadata.
This represents a monomer type that can be instantiated multiple times during stochastic growth. Each instantiation creates a fresh copy of the structure.
Attributes:
| Name | Type | Description |
|---|---|---|
label |
str
|
Monomer label (e.g., "EO2", "PS") |
structure |
Atomistic
|
Base Atomistic structure (will be copied on instantiation) |
port_descriptors |
dict[int, PortDescriptor]
|
Mapping from descriptor_id to PortDescriptor |
mass |
float
|
Molecular weight (g/mol) |
metadata |
dict[str, Any]
|
Additional metadata (optional) |
Example
template = MonomerTemplate(
    label="EO",
    structure=eo_monomer,
    port_descriptors={
        0: PortDescriptor(0, "<", role="left"),
        1: PortDescriptor(1, ">", role="right"),
    },
    mass=44.05,
)
fresh_copy = template.instantiate()
print(f"Template: {template.label}, mass={template.mass} g/mol")
get_all_descriptors ¶
Get all port descriptors for this template.
Returns:
| Type | Description |
|---|---|
list[PortDescriptor]
|
List of all PortDescriptor objects sorted by descriptor_id |
Example
template = MonomerTemplate(...)
descriptors = template.get_all_descriptors()
for desc in descriptors:
    print(f"Port {desc.descriptor_id}: {desc.port_name}")
get_port_by_descriptor ¶
Get port descriptor for a specific descriptor ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
descriptor_id
|
int
|
Descriptor ID to look up |
required |
Returns:
| Type | Description |
|---|---|
PortDescriptor | None
|
PortDescriptor if found, None otherwise |
Example
template = MonomerTemplate(...)
left_port = template.get_port_by_descriptor(0)
if left_port:
    print(f"Port: {left_port.port_name}, role: {left_port.role}")
instantiate ¶
Create a fresh copy of the structure.
Each instantiation is independent with separate atoms and bonds, allowing the same template to be used multiple times in a polymer.
Returns:
| Type | Description |
|---|---|
Atomistic
|
New Atomistic instance with independent atoms and bonds |
Example
template = MonomerTemplate(label="EO", structure=eo_monomer, ...)
copy1 = template.instantiate()
copy2 = template.instantiate()
print(copy1 is not copy2)  # Different objects, prints True
Placer ¶
Combined placer for positioning structures during assembly.
Uses a Separator to determine distance and an Orienter to determine orientation.
Initialize placer.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
separator
|
Separator
|
Separator for calculating distance |
required |
orienter
|
LinearOrienter
|
Orienter for calculating orientation |
required |
place_monomer ¶
Position right_struct relative to left_struct.
Modifies right_struct's atomic coordinates in-place.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
left_struct
|
Atomistic
|
Previous structure in sequence |
required |
right_struct
|
Atomistic
|
Next structure to place |
required |
left_port
|
Atom
|
Connection port on left structure |
required |
right_port
|
Atom
|
Connection port on right structure |
required |
PlanSystem
dataclass
¶
Bases: Tool
Plan a polydisperse polymer system from distribution parameters.
Returns chain specifications (DP, monomer sequence, mass) without creating any atoms. Use this to validate distribution parameters before committing to an expensive build.
Preferred for
- Previewing system composition before building.
- Iterating on distribution parameters cheaply.
Avoid when
- You want chains built directly (use BuildSystem or polymer_system).
Attributes:
| Name | Type | Description |
|---|---|---|
random_seed |
int | None
|
Random seed for reproducibility. |
run ¶
run(monomer_weights, monomer_mass, distribution_type, distribution_params, target_total_mass, end_group_mass=0.0, max_rel_error=0.02)
Plan a polydisperse polymer system.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
monomer_weights
|
dict[str, float]
|
Weight fractions for each monomer label. |
required |
monomer_mass
|
dict[str, float]
|
Molar mass (g/mol) per monomer label. |
required |
distribution_type
|
str
|
Distribution name (e.g. |
required |
distribution_params
|
dict[str, float]
|
Distribution parameters as |
required |
target_total_mass
|
float
|
Target total system mass (g/mol). |
required |
end_group_mass
|
float
|
Mass of end groups per chain (g/mol). |
0.0
|
max_rel_error
|
float
|
Maximum relative error for total mass. |
0.02
|
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with |
dict[str, Any]
|
and |
PoissonPolydisperse ¶
Poisson distribution for the degree of polymerization (DP).
Zero-truncated by clamping: a sampled k=0 is mapped to k=1, so the returned DP is always >= 1.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lambda_param | float | Mean of the Poisson distribution (> 0). | required |
| random_seed | int \| None | Optional random seed. | None |
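The effect of mapping k=0 to k=1 can be seen in a small standard-library sketch. The names below are hypothetical, and the Poisson sampler uses Knuth's multiplication method, which is only an assumption about how one might sample; the class itself may use NumPy. Note that under this mapping the probability of DP = 1 is P(0) + P(1) of the underlying Poisson, rather than a renormalized truncation.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for moderate lam."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_dp(lam, rng):
    """Map a sampled 0 to 1 so that DP >= 1."""
    return max(1, sample_poisson(lam, rng))


def clamped_pmf(k, lam):
    """PMF of the clamped variable: the mass of k=0 is moved onto k=1."""
    def poisson(n):
        return math.exp(-lam) * lam ** n / math.factorial(n)

    if k < 1:
        return 0.0
    if k == 1:
        return poisson(0) + poisson(1)
    return poisson(k)


rng = random.Random(1)
draws = [sample_dp(0.5, rng) for _ in range(2000)]
print(min(draws))  # never below 1
```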
PolydisperseChainGenerator ¶
Middle layer: Chain-level generator.
Responsible for:
- Sampling chain size:
  - Either in DP-space via a DPDistribution (sample_dp)
  - Or in mass-space via a MassDistribution (sample_mass)
- Using a SequenceGenerator to build the chain sequence
- Computing the mass of a chain using monomer mass table and optional end-group mass
Does NOT know anything about total system mass. Only returns one chain at a time.
Initialize polydisperse chain generator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
seq_generator
|
SequenceGenerator
|
Sequence generator for generating monomer sequences |
required |
monomer_mass
|
dict[str, float]
|
Dictionary mapping monomer identifiers to their masses (g/mol) |
required |
end_group_mass
|
float
|
Mass of end groups (g/mol), default 0.0 |
0.0
|
distribution
|
DPDistribution | MassDistribution | None
|
Distribution implementing DPDistribution or MassDistribution protocol |
None
|
build_chain ¶
Sample DP, generate monomer sequence, and compute mass.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rng
|
Generator
|
NumPy random number generator (np.random.Generator) |
required |
Returns:
| Type | Description |
|---|---|
Chain
|
Chain object with dp, monomers, and mass |
sample_dp ¶
Sample a degree of polymerization from the distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rng
|
Generator
|
NumPy random number generator (np.random.Generator) |
required |
Returns:
| Type | Description |
|---|---|
int
|
Degree of polymerization (>= 1) |
sample_mass ¶
Sample a target chain mass from a mass-based distribution.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rng
|
Generator
|
NumPy random number generator (np.random.Generator) |
required |
Returns:
| Type | Description |
|---|---|
float
|
Target chain mass in g/mol (>= 0) |
PolymerBuildResult
dataclass
¶
Result of building a polymer.
PolymerBuilder ¶
Build polymers from CGSmiles notation with support for arbitrary topologies.
This builder parses CGSmiles strings and constructs polymers using a graph-based
approach, supporting:
- Linear chains: {[#A][#B][#C]}
- Branched structures: {[#A]([#B])[#C]}
- Cyclic structures: {[#A]1[#B][#C]1}
- Repeat operators: {[#A]|10}
Example
builder = PolymerBuilder(
    library={"EO2": eo2_monomer, "PS": ps_monomer},
    connector=connector,
    typifier=typifier,
)
result = builder.build("{[#EO2]|8[#PS]}")
Initialize the polymer builder.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
library
|
Mapping[str, Atomistic]
|
Mapping from CGSmiles labels to Atomistic monomer structures |
required |
connector
|
Connector
|
Connector for port selection and chemical reactions |
required |
typifier
|
TypifierBase | None
|
Optional typifier for automatic retypification |
None
|
placer
|
Placer | None
|
Optional Placer for positioning structures before connection |
None
|
build ¶
Build a polymer from a CGSmiles string.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cgsmiles
|
str
|
CGSmiles notation string (e.g., "{[#EO2]|8[#PS]}") |
required |
Returns:
| Type | Description |
|---|---|
PolymerBuildResult
|
PolymerBuildResult containing the assembled polymer and metadata |
Raises:
| Type | Description |
|---|---|
ValueError
|
If CGSmiles is invalid |
SequenceError
|
If labels in CGSmiles are not found in library |
PortDescriptor
dataclass
¶
Descriptor for a reactive port on a monomer template.
Port descriptors identify ports with unique IDs and store metadata about port behavior (role, bond type, compatibility).
Attributes:
| Name | Type | Description |
|---|---|---|
descriptor_id |
int
|
Unique ID within template (e.g., 0, 1, 2) |
port_name |
str
|
Port name on atom (e.g., "<", ">", "branch") |
role |
str | None
|
Port role (e.g., "left", "right", "branch") |
bond_kind |
str | None
|
Bond type (e.g., "-", "=", "#") |
compat |
set[str] | None
|
Compatibility set for port matching |
Example
desc = PortDescriptor(
    descriptor_id=0,
    port_name="<",
    role="left",
    bond_kind="-",
    compat={"donor"},
)
print(f"Descriptor {desc.descriptor_id}: port '{desc.port_name}' ({desc.role})")
PrepareMonomer
dataclass
¶
Bases: Tool
Parse a BigSMILES monomer string and produce an Atomistic structure.
Pipeline: parse BigSMILES → convert to Atomistic with port markers → generate 3D coordinates via RDKit (if available) → compute angles/dihedrals.
Preferred for
- Preparing monomers for BuildPolymer or polymer().
- One-step SMILES-to-3D when you need port annotations.
Avoid when
- You already have an Atomistic struct (use RDKit adapter directly).
- You need custom 3D embedding parameters (use Generate3D).
Attributes:
| Name | Type | Description |
|---|---|---|
add_hydrogens |
bool
|
Add explicit hydrogens during 3D generation. |
optimize |
bool
|
Optimize geometry after 3D embedding. |
gen_topology |
bool
|
Compute angles and dihedrals. |
ProbabilityTableKernel ¶
GrowthKernel based on G-BigSMILES probability tables.
This kernel uses pre-computed probability tables that map each port descriptor to weighted choices over (template, target_descriptor_id) pairs. Weights are integers that are normalized to probabilities during sampling.
Initialize probability table kernel.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
probability_tables
|
dict[int, list[tuple[MonomerTemplate, int, int]]]
|
Maps descriptor_id -> [(template, target_desc, integer_weight)] Integer weights are normalized to probabilities during sampling. |
required |
end_group_templates
|
dict[int, MonomerTemplate] | None
|
Maps descriptor_id -> end-group template (no ports) |
None
|
choose_next_for_port ¶
Choose next monomer based on probability table.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
polymer
|
Atomistic
|
Current polymer structure |
required |
port
|
Atom
|
Port to extend from |
required |
candidates
|
Sequence[MonomerTemplate]
|
Available monomer templates |
required |
rng
|
Generator | None
|
Random number generator (uses default if None) |
None
|
Returns:
| Type | Description |
|---|---|
MonomerPlacement | None
|
MonomerPlacement or None (terminate) |
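The normalization of integer weights into probabilities can be sketched with the standard library. This is an illustrative stand-in: `choose_next` is a hypothetical function, templates are represented by plain strings, and returning None for a descriptor with no table entries is an assumption about termination (end-group templates are not modeled).

```python
import random


def choose_next(probability_tables, descriptor_id, rng):
    """Pick (template, target_descriptor_id) with probability weight / sum(weights)."""
    entries = probability_tables.get(descriptor_id)
    if not entries:
        return None  # assumed to mean: terminate this port
    weights = [weight for _, _, weight in entries]
    template, target_desc, _ = rng.choices(entries, weights=weights, k=1)[0]
    return template, target_desc


# descriptor_id -> [(template, target_descriptor_id, integer_weight)]
tables = {1: [("EO", 0, 3), ("PO", 0, 1)]}
rng = random.Random(7)
picks = [choose_next(tables, 1, rng) for _ in range(10000)]
eo_fraction = sum(1 for p in picks if p[0] == "EO") / len(picks)
print(eo_fraction)  # close to 3 / (3 + 1)
```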
SchulzZimmPolydisperse ¶
Schulz-Zimm molecular weight distribution for polydisperse polymer chains.
Implements MassDistribution: sampling is done directly in molecular-weight space.
The probability density is:
.. math::
    f(M) = \frac{z^{z+1}}{\Gamma(z+1)} \frac{M^{z-1}}{M_n^{z}} \exp\left(-\frac{z M}{M_n}\right),
where z = Mn / (Mw - Mn). This is equivalent to a Gamma distribution with shape z and scale theta = Mw - Mn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| Mn | float | Number-average molecular weight (g/mol). | required |
| Mw | float | Weight-average molecular weight (g/mol), must satisfy Mw > Mn. | required |
| random_seed | int \| None | Optional random seed. | None |
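Because the density is a Gamma distribution with shape z = Mn / (Mw - Mn) and scale theta = Mw - Mn, masses can be drawn with any Gamma sampler. The sketch below uses `random.gammavariate` from the standard library; `schulz_zimm_params` and `sample_mass` are hypothetical names, and the class itself may sample differently (for example through NumPy).

```python
import random


def schulz_zimm_params(Mn, Mw):
    """Return (shape z, scale theta) of the equivalent Gamma distribution."""
    if Mw <= Mn:
        raise ValueError("Mw must be greater than Mn")
    theta = Mw - Mn
    z = Mn / theta
    return z, theta


def sample_mass(Mn, Mw, rng):
    """Draw one molecular weight (g/mol) from the Schulz-Zimm distribution."""
    z, theta = schulz_zimm_params(Mn, Mw)
    return rng.gammavariate(z, theta)  # alpha = shape, beta = scale


rng = random.Random(42)
masses = [sample_mass(1500.0, 3000.0, rng) for _ in range(20000)]
print(sum(masses) / len(masses))  # close to Mn, since the Gamma mean is z * theta
```

The mean z * theta equals Mn, and the ratio of the second to the first moment, (z + 1) * theta, equals Mw, which is why these two averages fully determine the distribution.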
SequenceGenerator ¶
Bases: Protocol
Protocol for sequence generators.
A sequence generator controls how monomers are arranged in a single chain.
expected_composition ¶
Return expected long-chain monomer fractions.
Returns:
| Type | Description |
|---|---|
dict[str, float]
|
Dictionary mapping monomer identifiers to expected fractions |
generate_sequence ¶
Generate a monomer sequence of specified degree of polymerization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dp
|
int
|
Degree of polymerization (number of monomers) |
required |
rng
|
Generator
|
numpy random Generator |
required |
Returns:
| Type | Description |
|---|---|
list[str]
|
List of monomer identifiers (strings) |
StochasticChain
dataclass
¶
Result of stochastic BFS growth.
Contains the assembled polymer structure along with metadata about the growth process.
Attributes:
| Name | Type | Description |
|---|---|---|
polymer |
Atomistic
|
The assembled Atomistic structure |
dp |
int
|
Degree of polymerization (number of monomers added) |
mass |
float
|
Total molecular weight (g/mol) |
growth_history |
list[dict[str, Any]]
|
Metadata for each monomer addition step |
Example
chain = StochasticChain(
    polymer=final_structure,
    dp=25,
    mass=1101.25,
    growth_history=[...],
)
print(f"Built polymer: DP={chain.dp}, mass={chain.mass:.1f} g/mol")
SystemPlan
dataclass
¶
Represents a complete system plan with all chains.
Attributes:
| Name | Type | Description |
|---|---|---|
chains |
list[Chain]
|
List of all chains in the system |
total_mass |
float
|
Total mass of all chains (g/mol) |
target_mass |
float
|
Target total mass that was requested (g/mol) |
SystemPlanner ¶
SystemPlanner(chain_generator, target_total_mass, max_rel_error=0.02, max_chains=None, enable_trimming=True)
Top layer: System-level planner.
Responsible for:
- Enforcing a target total mass for the overall system
- Iteratively requesting chains from PolydisperseChainGenerator
- Maintaining a running sum of total mass
- Stopping when mass reaches target window, and optionally trimming the final chain
Does NOT micromanage sequence probabilities or DP distribution; only orchestrates at the ensemble level.
Initialize system planner.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
chain_generator
|
PolydisperseChainGenerator
|
Chain generator for building chains |
required |
target_total_mass
|
float
|
Target total system mass (g/mol) |
required |
max_rel_error
|
float
|
Maximum relative error allowed (default 0.02 = 2%) |
0.02
|
max_chains
|
int | None
|
Maximum number of chains to generate (None = no limit) |
None
|
enable_trimming
|
bool
|
Whether to enable chain trimming to better hit target mass |
True
|
plan_system ¶
Repeatedly ask chain_generator for new chains until accumulated mass reaches target_total_mass within max_rel_error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
rng
|
Generator
|
NumPy random number generator (np.random.Generator) |
required |
Returns:
| Type | Description |
|---|---|
SystemPlan
|
SystemPlan with all chains and total mass |
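The accumulate-until-target loop can be sketched as follows. This is a simplified illustration, not the planner's actual algorithm: `plan_system` here is a hypothetical free function, every monomer is assumed to have the same mass, end groups are ignored, and "trimming" is modeled as shortening the final chain so that the total does not exceed the target.

```python
import random


def plan_system(sample_dp, monomer_mass, target_total_mass,
                max_rel_error=0.02, max_chains=None, enable_trimming=True):
    """Return (list of chain DPs, total mass) for a homopolymer system."""
    lower = target_total_mass * (1.0 - max_rel_error)
    upper = target_total_mass * (1.0 + max_rel_error)
    chain_dps, total = [], 0.0
    while total < lower:
        if max_chains is not None and len(chain_dps) >= max_chains:
            break
        dp = sample_dp()
        if enable_trimming and total + dp * monomer_mass > upper:
            # Shorten the final chain so the total lands at or just below the target.
            dp = int((target_total_mass - total) // monomer_mass)
            if dp < 1:
                break  # window is narrower than one monomer; give up
        chain_dps.append(dp)
        total += dp * monomer_mass
    return chain_dps, total


rng = random.Random(0)
chain_dps, total = plan_system(lambda: rng.randint(5, 40), monomer_mass=44.05,
                               target_total_mass=50000.0)
print(len(chain_dps), round(total, 1))
```

With trimming disabled, the last sampled chain is kept whole and the total may overshoot the upper bound of the window.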
UniformPolydisperse ¶
Uniform distribution over degree of polymerization (DP).
All integer DP values between min_dp and max_dp (inclusive) are equally likely.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
min_dp
|
int
|
Lower bound (>= 1). |
required |
max_dp
|
int
|
Upper bound (>= min_dp). |
required |
random_seed
|
int | None
|
Optional random seed. |
None
|
VdWSeparator ¶
Separator based on van der Waals radii.
Calculates separation as sum of VdW radii of the two port anchor atoms, plus an optional buffer distance.
NOTE: VdW radii are designed for non-bonded contacts (~3-4 Å). For bonded atoms, use CovalentSeparator instead.
Initialize VdW separator.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
buffer
|
float
|
Additional buffer distance in Angstroms (default: 0.0) |
0.0
|
get_separation ¶
Calculate separation based on VdW radii.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
left_struct
|
Atomistic
|
Previous structure in sequence |
required |
right_struct
|
Atomistic
|
Next structure to place |
required |
left_port
|
Atom
|
Connection port on left structure |
required |
right_port
|
Atom
|
Connection port on right structure |
required |
Returns:
| Type | Description |
|---|---|
float
|
Separation distance = vdw_left + vdw_right + buffer |
WeightedSequenceGenerator ¶
Sequence generator based on monomer weights/proportions.
Each selection is independent (no memory of previous selections).
generate_sequence ¶
Generate a sequence of specified degree of polymerization.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dp
|
int
|
Degree of polymerization (number of monomers) |
required |
rng
|
Generator
|
numpy random Generator |
required |
Returns:
| Type | Description |
|---|---|
list[str]
|
List of monomer identifiers |
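Independent, memoryless selection by weight can be sketched with `random.choices`. The two functions below are hypothetical stand-ins for the generator's methods and take the weight table as an explicit argument; the real class presumably stores it on the instance and uses a NumPy generator.

```python
import random


def generate_sequence(weights, dp, rng):
    """Draw dp monomer labels independently, with probability proportional to weight."""
    labels = list(weights)
    return rng.choices(labels, weights=[weights[label] for label in labels], k=dp)


def expected_composition(weights):
    """Long-chain monomer fractions are just the normalized weights."""
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}


weights = {"EO": 3.0, "PO": 1.0}
rng = random.Random(3)
sequence = generate_sequence(weights, dp=12, rng=rng)
print(sequence)
print(expected_composition(weights))  # {'EO': 0.75, 'PO': 0.25}
```

Because each draw ignores the previous one, short chains can deviate noticeably from the expected composition; only long chains converge to it.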
generate_3d ¶
Generate 3D coordinates for a molecular structure via RDKit.
Thin re-export of molpy.adapter.rdkit.generate_3d for use inside polymer-building workflows.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
mol
|
Atomistic
|
Atomistic structure (typically from parser.parse_molecule) |
required |
add_hydrogens
|
bool
|
Add explicit hydrogens before embedding |
True
|
optimize
|
bool
|
Run force-field geometry optimization after embedding |
True
|
Returns:
| Type | Description |
|---|---|
Atomistic
|
New Atomistic with 3D coordinates and (optionally) explicit hydrogens |
Raises:
| Type | Description |
|---|---|
ImportError
|
if RDKit is not installed |
polymer ¶
polymer(spec, *, library=None, reaction_preset='dehydration', use_placer=True, add_hydrogens=True, optimize=True, random_seed=None, backend='default', amber_config=None)
Build a single polymer chain from a string specification.
Auto-detects notation type (for the default backend):
- G-BigSMILES (contains a | annotation): polymer("{[<]CCOCC[>]}|10|")
- CGSmiles + inline fragments (contains .{#): polymer("{[#EO]|10}.{#EO=[<]COC[>]}")
- Pure CGSmiles (requires the library kwarg): polymer("{[#EO]|10}", library={"EO": eo_monomer})
For the Amber backend:
polymer("{[#EO]|10}", library={"EO": eo}, backend="amber")
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| spec | str | Polymer specification string. | required |
| library | Mapping[str, Atomistic] \| None | Monomer library (required for pure CGSmiles and Amber). | None |
| reaction_preset | str | Reaction preset name. | 'dehydration' |
| use_placer | bool | Enable geometric placement (default backend only). | True |
| add_hydrogens | bool | Add hydrogens during 3D generation. | True |
| optimize | bool | Optimize geometry. | True |
| random_seed | int \| None | Random seed for reproducibility. | None |
| backend | Backend | Builder backend, e.g. 'default' or 'amber'. | 'default' |
| amber_config | Any | Optional configuration for the Amber backend. | None |
Returns:
| Type | Description |
|---|---|
| Atomistic \| Any | Atomistic (default backend) or AmberBuildResult (amber backend). |
polymer_system ¶
polymer_system(spec, *, reaction_preset='dehydration', add_hydrogens=True, optimize=True, random_seed=None)
Build a multi-chain polymer system from G-BigSMILES.
Example::
    chains = polymer_system(
        "{[<]CCOCC[>]}|schulz_zimm(1500,3000)||5e5|",
        random_seed=42,
    )
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| spec | str | G-BigSMILES specification string. | required |
| reaction_preset | str | Reaction preset name. | 'dehydration' |
| add_hydrogens | bool | Add hydrogens during 3D generation. | True |
| optimize | bool | Optimize geometry. | True |
| random_seed | int \| None | Random seed for reproducibility. | None |
Returns:
| Type | Description |
|---|---|
| list[Atomistic] | List of Atomistic structures (one per chain). |
prepare_monomer ¶
prepare_monomer(bigsmiles, typifier=None, *, add_hydrogens=True, optimize=True, gen_angle=True, gen_dihe=True)
Parse, embed in 3D, augment topology, and optionally typify a monomer.
Bundles the four-step pattern that appears in every polymer-building workflow::
m = mp.parser.parse_monomer(bigsmiles)
m = generate_3d(m, add_hydrogens=True, optimize=True)
m = m.get_topo(gen_angle=True, gen_dihe=True)
m = typifier.typify(m)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| bigsmiles | str | BigSMILES string. | required |
| typifier | | Optional typifier instance. | None |
| add_hydrogens | bool | Add implicit hydrogens during 3D generation. | True |
| optimize | bool | Run force-field geometry optimization after embedding. | True |
| gen_angle | bool | Generate angle interactions from bonds. | True |
| gen_dihe | bool | Generate dihedral interactions from bonds. | True |
Returns:
| Type | Description |
|---|---|
| Atomistic | Fully prepared Atomistic monomer ready for reactions or export. |
Polymer DSL tools¶
High-level polymer-building tools and entry functions (PrepareMonomer,
BuildPolymer, PlanSystem, BuildSystem, BuildPolymerAmber, polymer,
polymer_system, prepare_monomer, generate_3d).
dsl ¶
Polymer building tools.
Tools that wrap the parser, adapter, builder, and reacter modules into single-call workflows for common polymer construction tasks.
Tools (auto-registered in ToolRegistry):
- PrepareMonomer — BigSMILES → 3D Atomistic with ports
- BuildPolymer — CGSmiles + library → assembled chain
- PlanSystem — distribution parameters → chain plan (no atoms)
- BuildSystem — G-BigSMILES → list of built chains
Convenience functions:
- polymer() — auto-detect notation, build single chain
- polymer_system() — G-BigSMILES → multi-chain system
BuildPolymer dataclass ¶
Bases: Tool
Build a polymer chain from CGSmiles notation and a monomer library.
Preferred for
- Assembling a single chain from pre-prepared monomers.
- Iterating over a system plan to build chains one at a time.
Avoid when
- You want end-to-end build from a string (use polymer() or BuildSystem).
- You need custom reaction logic (use PolymerBuilder directly).
Attributes:
| Name | Type | Description |
|---|---|---|
| reaction_preset | str | Name of reaction preset. |
| use_placer | bool | Enable geometric placement of monomers. |
run ¶
Build a polymer chain.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| cgsmiles | str | CGSmiles notation. | required |
| library | dict[str, Atomistic] | Mapping from label to prepared Atomistic monomer. | required |
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with |
dict[str, Any]
|
and |
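When iterating over a system plan, each planned monomer sequence has to be turned into a CGSmiles string before it is passed to BuildPolymer. The sketch below shows one way to do that with run-length compression. `sequence_to_cgsmiles` is a hypothetical helper, and it assumes that the `|n` repeat operator seen in `{[#EO]|10}` may also follow individual nodes inside a longer string.

```python
from itertools import groupby


def sequence_to_cgsmiles(sequence: list) -> str:
    """Encode a linear monomer sequence as a CGSmiles string."""
    if not sequence:
        raise ValueError("sequence must contain at least one monomer label")
    parts = []
    # groupby collapses consecutive identical labels into one run.
    for label, run in groupby(sequence):
        count = sum(1 for _ in run)
        node = f"[#{label}]"
        parts.append(node if count == 1 else f"{node}|{count}")
    return "{" + "".join(parts) + "}"


homopolymer = sequence_to_cgsmiles(["EO"] * 10)
block = sequence_to_cgsmiles(["EO", "EO", "PO"])
```

Every label that appears in the sequence would need a matching key in the `library` mapping passed to `run`.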
BuildPolymerAmber dataclass ¶
BuildPolymerAmber(reaction_preset='dehydration', force_field='gaff2', charge_method='bcc', conda_env=None, work_dir='amber_work')
Bases: Tool
Build a polymer chain using the AmberTools backend.
Uses antechamber, parmchk2, prepgen, and tleap to assemble a polymer from a CGSmiles string and a monomer library. Returns both MolPy structures and AMBER topology/coordinate files.
Preferred for
- Polymer systems that need AMBER force field parameters (GAFF/GAFF2).
- Workflows that feed into AMBER or LAMMPS with AMBER-style inputs.
Avoid when
- You do not need force field parameters (use BuildPolymer).
- AmberTools is not installed.
Attributes:
| Name | Type | Description |
|---|---|---|
| reaction_preset | str \| None | Named preset for leaving group detection. When None, hydrogen atoms bonded to port atoms are auto-detected. |
| force_field | str | Amber force field. |
| charge_method | str | Antechamber charge method. |
| conda_env | str \| None | Conda environment containing AmberTools. |
| work_dir | str | Directory for intermediate files. |
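Because this backend shells out to external programs, it can help to check for them before starting a build. The sketch below is an illustration only: `missing_tools` and `wrap_command` are hypothetical helpers, not part of molpy, and the `conda run -n` wrapping is an assumption about how a `conda_env` setting might be applied.

```python
import shutil

# Executables named in the class description above.
AMBER_TOOLS = ("antechamber", "parmchk2", "prepgen", "tleap")


def missing_tools(tools=AMBER_TOOLS) -> list:
    """Return the tools that are not found on the current PATH."""
    # shutil.which only inspects the current PATH, so tools installed in an
    # inactive conda environment are reported as missing here.
    return [tool for tool in tools if shutil.which(tool) is None]


def wrap_command(argv: list, conda_env=None) -> list:
    """Prefix a command so that it runs inside a conda environment."""
    if conda_env is None:
        return list(argv)
    return ["conda", "run", "-n", conda_env, *argv]


command = wrap_command(["tleap"], conda_env="amber")
not_found = missing_tools(["no-such-tool-xyz123"])
```

Failing fast on a missing executable is cheaper than discovering it halfway through parameterization.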
run ¶
Build a polymer using AmberTools.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
cgsmiles
|
str
|
CGSmiles notation (e.g. |
required |
library
|
dict[str, Atomistic]
|
Mapping from label to prepared Atomistic monomer.
Each monomer must have |
required |
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with |
dict[str, Any]
|
|
BuildSystem dataclass ¶
Bases: Tool
End-to-end polymer system construction from G-BigSMILES.
Parses a G-BigSMILES string and delegates to the GBigSmilesCompiler to produce a list of Atomistic chains.
Preferred for
- Building a complete polydisperse system in one call.
- When you do not need to inspect the system plan before building.
Avoid when
- You need to inspect or modify the plan first (use PlanSystem + BuildPolymer).
- You need the Amber backend (use BuildPolymerAmber).
Attributes:
| Name | Type | Description |
|---|---|---|
| reaction_preset | str | Name of reaction preset. |
| add_hydrogens | bool | Add explicit hydrogens during monomer preparation. |
| optimize | bool | Optimize monomer geometry. |
| random_seed | int \| None | Random seed for reproducibility. |
PlanSystem dataclass ¶
Bases: Tool
Plan a polydisperse polymer system from distribution parameters.
Returns chain specifications (DP, monomer sequence, mass) without creating any atoms. Use this to validate distribution parameters before committing to an expensive build.
Preferred for
- Previewing system composition before building.
- Iterating on distribution parameters cheaply.
Avoid when
- You want chains built directly (use BuildSystem or polymer_system).
Attributes:
| Name | Type | Description |
|---|---|---|
| random_seed | int \| None | Random seed for reproducibility. |
run ¶
run(monomer_weights, monomer_mass, distribution_type, distribution_params, target_total_mass, end_group_mass=0.0, max_rel_error=0.02)
Plan a polydisperse polymer system.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| monomer_weights | dict[str, float] | Weight fractions for each monomer label. | required |
| monomer_mass | dict[str, float] | Molar mass (g/mol) per monomer label. | required |
| distribution_type | str | Distribution name. | required |
| distribution_params | dict[str, float] | Distribution parameters. | required |
| target_total_mass | float | Target total system mass (g/mol). | required |
| end_group_mass | float | Mass of end groups per chain (g/mol). | 0.0 |
| max_rel_error | float | Maximum relative error for total mass. | 0.02 |
Returns:
| Type | Description |
|---|---|
dict[str, Any]
|
Dict with |
dict[str, Any]
|
and |
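To make the planning step concrete, the sketch below draws chains until the total mass lands within `max_rel_error` of the target. It is not the library's algorithm. It assumes a Schulz-Zimm distribution parameterized by Mn and Mw and sampled as a gamma distribution, it converts weight fractions to number fractions before drawing labels, and the function name and returned keys are hypothetical.

```python
import random


def plan_system(monomer_weights, monomer_mass, mn, mw, target_total_mass,
                end_group_mass=0.0, max_rel_error=0.02, seed=None,
                max_rejections=1000):
    """Plan chains (DP, sequence, mass) without building any atoms."""
    rng = random.Random(seed)
    labels = list(monomer_weights)
    # Weight fractions -> number fractions: p_i is proportional to w_i / m_i.
    raw = [monomer_weights[m] / monomer_mass[m] for m in labels]
    probs = [r / sum(raw) for r in raw]
    mean_unit = sum(p * monomer_mass[m] for p, m in zip(probs, labels))

    # Schulz-Zimm as a gamma distribution: Mw / Mn = (k + 1) / k, mean = Mn.
    # Requires mw > mn.
    k = mn / (mw - mn)
    lower = target_total_mass * (1.0 - max_rel_error)
    upper = target_total_mass * (1.0 + max_rel_error)

    chains, total, rejected = [], 0.0, 0
    while total < lower:
        chain_mass = rng.gammavariate(k, mn / k)
        dp = max(1, round((chain_mass - end_group_mass) / mean_unit))
        sequence = rng.choices(labels, weights=probs, k=dp)
        mass = end_group_mass + sum(monomer_mass[m] for m in sequence)
        if total + mass > upper:
            # This chain would overshoot the tolerance window; draw again.
            rejected += 1
            if rejected > max_rejections:
                raise RuntimeError("could not meet the mass tolerance")
            continue
        rejected = 0
        chains.append({"dp": dp, "sequence": sequence, "mass": mass})
        total += mass

    rel_error = abs(total - target_total_mass) / target_total_mass
    return {"chains": chains, "total_mass": total, "rel_error": rel_error}


plan = plan_system({"EO": 1.0}, {"EO": 44.05}, mn=1500, mw=3000,
                   target_total_mass=5e5, seed=42)
```

Because no atoms are created, a plan like this is cheap to regenerate while tuning the distribution parameters.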
PrepareMonomer dataclass ¶
Bases: Tool
Parse a BigSMILES monomer string and produce an Atomistic structure.
Pipeline: parse BigSMILES → convert to Atomistic with port markers → generate 3D coordinates via RDKit (if available) → compute angles/dihedrals.
Preferred for
- Preparing monomers for BuildPolymer or polymer().
- One-step SMILES-to-3D when you need port annotations.
Avoid when
- You already have an Atomistic struct (use RDKit adapter directly).
- You need custom 3D embedding parameters (use Generate3D).
Attributes:
| Name | Type | Description |
|---|---|---|
| add_hydrogens | bool | Add explicit hydrogens during 3D generation. |
| optimize | bool | Optimize geometry after 3D embedding. |
| gen_topology | bool | Compute angles and dihedrals. |
Tool framework¶
Tool and ToolRegistry are the internal base classes that the builder
DSL tools are built on. They are not public top-level exports.
_tool ¶
Tool framework for executable builder operations.
Provides:
- ToolRegistry: auto-discovery registry for Tool subclasses
- Tool: frozen-dataclass ABC for executable tools (builders, transforms)
Tool dataclass ¶
Bases: ABC
Base class for executable tools (builders, transforms).
Concrete subclasses are auto-registered in ToolRegistry and
discovered by the MCP server. Tool is intended for molecular
operations that produce or transform structures.
Usage::
    @dataclass(frozen=True)
    class MyTool(Tool):
        param: int = 10

        def run(self, input: str) -> dict:
            return {"result": input, "param": self.param}

    tool = MyTool(param=5)
    result = tool("hello")  # delegates to run()
ToolRegistry ¶
Auto-discovery registry for Tool subclasses.
Concrete Tool subclasses register themselves automatically via
__init_subclass__. The MCP server iterates this registry
to discover and expose available tools.
Usage::
for name, cls in ToolRegistry.get_all().items():
    print(f"{name}: {cls.__doc__}")
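The registration mechanism described above can be reproduced in a few lines. The sketch below is a self-contained imitation, not the source of `_tool`: the class names shadow the real ones for illustration only, the `Echo` tool is hypothetical, and the rule "register any subclass that defines `run`" is an assumption about what counts as a concrete tool.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Any


class ToolRegistry:
    """Maps class names to Tool subclasses as they are defined."""

    _tools: dict = {}

    @classmethod
    def register(cls, tool_cls: type) -> None:
        cls._tools[tool_cls.__name__] = tool_cls

    @classmethod
    def get_all(cls) -> dict:
        # Return a copy so callers cannot mutate the registry by accident.
        return dict(cls._tools)


@dataclass(frozen=True)
class Tool(ABC):
    """Frozen-dataclass base class; calling an instance delegates to run()."""

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Only classes that define run() themselves are treated as concrete.
        if "run" in cls.__dict__:
            ToolRegistry.register(cls)

    @abstractmethod
    def run(self, *args: Any, **kwargs: Any) -> Any:
        """Execute the tool."""

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        return self.run(*args, **kwargs)


@dataclass(frozen=True)
class Echo(Tool):
    param: int = 10

    def run(self, text: str) -> dict:
        return {"result": text, "param": self.param}


tool = Echo(param=5)
result = tool("hello")  # delegates to run()
```

Registering in `__init_subclass__` means that importing the module which defines a tool is enough to make it discoverable, and the frozen dataclass keeps each configured tool immutable once it has been created.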