This is a formalisation I am using to help clarify an intuition I have about the structure of agents, which I refer to as entities here. Each entity is represented as a structure similar to a highly connected branched chain copolymer, where physical and conceptual “particles” make up the monomeric units. Entities naturally compute because their fluctuations in physical and conceptual space cause them to fold and unfold into different configurations.

Entity and Particle Spaces

An entity exists as a dynamic configuration of causally linked nodes (particles), residing in a high-dimensional state space . This space aims to incorporate dimensions relevant to the entity’s state, such as physical coordinates () and conceptual dimensions ().

Where represents physical particles and represents conceptual particles.

In a language model (LM), from Notation for LM Formalization, the LM is an entity , and is described as follows:

  • The objective information space is the underlying space of all information
  • The LM’s subjective information space (accessed by the LM’s inference function "") corresponds to the primary conceptual subspace for , (i.e. ).
  • The LM’s personality represents a configuration of conceptual nodes within . The personality space is the mapping of in this space for all possible inputs .
  • The LM’s physical substrate (hardware, computation, interface) constitutes , with a signalling particle that generates output when a receptor particle receives input.

The state of each node can be described probabilistically using a state distribution function , where represents a point in state space at time .

The state distribution function can be thought of as giving the probability density associated with a node’s state being at a given location at a given time. While the complex form is useful for handling interference phenomena, a real-valued probability distribution should be sufficient for many applications.

Entities are bounded systems where causal links between particles exceed a threshold :

Where is the causal (bond) strength between particles and . Importantly, this threshold is scale variant: the requirement for causal linkage of semantic/physical particles is lower for people to be considered part of a nation than for them to be considered part of a family, or part of an individual.
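As a rough sketch of the thresholding idea above, entities can be read off as the connected components that remain once only bonds stronger than the threshold are kept. The function name `entities`, the example particles, and the numeric strengths are all hypothetical, and bonds are treated as symmetric here purely for simplicity (an assumption, since interactions in this framework need not be reciprocal).

```python
from collections import defaultdict

def entities(bonds, threshold):
    """Group particles into entities: connected components of the graph
    that keeps only bonds whose strength exceeds `threshold`.
    Assumption: bonds are symmetric (the framework allows asymmetry)."""
    adjacency = defaultdict(set)
    particles = set()
    for (a, b), strength in bonds.items():
        particles.update((a, b))
        if strength > threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    seen, groups = set(), []
    for start in sorted(particles):
        if start in seen:
            continue
        # Depth-first walk over the surviving (above-threshold) bonds.
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(group)
    return groups

# Hypothetical bond strengths between four people.
bonds = {("alice", "bob"): 0.9, ("bob", "carol"): 0.4, ("carol", "dave"): 0.8}
family_level = entities(bonds, threshold=0.6)  # stricter linkage requirement
nation_level = entities(bonds, threshold=0.3)  # looser linkage requirement
```

With the stricter threshold the four people split into two entities; with the looser one they merge into a single entity, which is the scale variance described above.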

Particle detection and interaction

Building on (objective information space) and (subjective information space) from Notation for LM Formalization.

Objective physical subspace , and subjective physical subspace accessed via inference process :

With a conceptual equivalent:

While represents the total subjective information space accessible to the entity/model via its overall inference process , specific inference processes (like or ) might only operate on or access further subsets or depending on their function. This creates a hierarchical structure of accessible spaces.

Inference is the process by which a series of interactions between particles experiences causal procession, “if X then Y”. There exist four specific forms:

Physical to semantic, “If feel X, then think Y”, sensory input alters belief state:

Semantic to physical, “If think X, then feel Y”, belief triggers physiological response:

Semantic to semantic, “If think X, then think Y”, one thought leads to another:

Physical to physical, “If feel X, then feel Y”, muscle contraction cascade:

Where:

  • represents an inference process (e.g., reasoning, body awareness, emotional processing)
  • Different entities may employ different inference functions
  • Each accesses some portion of the objective spaces and

For example:

  • A body scan meditation () would primarily access physical particles:
  • Logical reasoning () would primarily access semantic particles:
  • Emotional processing () might access both:

This allows us to model how different cognitive and physical processes can operate on the same underlying particle space but through different inferential lenses.

There are both homogeneous particle interactions, such as a calcium gradient causing a muscle to tense (), and heterogeneous interactions, like a smell reminding someone of a fear and causing them to tense physically ().

Attention as Network Analysis

Attention is a measure of connectedness for a particle , as a function of bond strength between physical particles and semantic particles:

Where is a function weighting the contribution of individual bond strengths B (e.g., sums all strengths, or a threshold function like if , otherwise, counts strong bonds).
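A minimal sketch of the two weighting choices mentioned above, assuming attention is a plain sum of weighted bond strengths over a particle's bonds. The names `attention` and `strong_bond_counter`, and the reading of the threshold function as "1 if the bond exceeds the threshold, 0 otherwise", are my assumptions for illustration.

```python
def attention(bond_strengths, weight=lambda b: b):
    """Attention of one particle: sum of weighted bond strengths.
    The default weight is the identity, i.e. it sums all strengths."""
    return sum(weight(b) for b in bond_strengths)

def strong_bond_counter(threshold):
    """Weighting that counts only bonds stronger than `threshold`
    (assumed form: 1 if b > threshold, else 0)."""
    return lambda b: 1 if b > threshold else 0

# Hypothetical bond strengths between an arm particle and three concepts.
bonds_of_arm = [0.2, 0.7, 0.9]
total = attention(bonds_of_arm)                             # sum of strengths
strong = attention(bonds_of_arm, strong_bond_counter(0.5))  # count of strong bonds
```

The summed form responds to every small change in bond strength, while the counting form only changes when a bond crosses the threshold.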

Changes in attention represent changes in the connectedness of the network, and can be used as a method of detecting the underlying properties of bonds.

That said, if one does not granularly consider every discrete physical interaction that carries information, then one can also calculate attention based on homogeneous bonds (P-P, S-S), although these ultimately rely on physical mediation.

The underlying bond strengths , reflected in the overall attention profile , determine how effectively particles and interconnected particle structures influence each other’s states and positions in their respective spaces. The interaction is not necessarily reciprocal: pain as a concept does not change nearly as much as the physical manifestation of the body does when exposed to it.

Practically one can think of how one may focus on their arm, connecting it to abstract notions of softness, comfort, and pain. In this framework, when sensory inputs (like touch) or cognitive relevance signals (like task demands) trigger attention to the arm, the underlying particle dynamics respond by rearranging in physical and semantic spaces. This rearrangement effectively decreases the distance between particles (e.g., by physically touching a soft object) or optimizes the relative orientation to expose high-affinity patches (e.g., by contemplating the concept of softness), between the arm’s physical particle and the semantic sensation particle . These dynamic changes in and parameters increase and contribute to a higher attention measure . When external triggers diminish or cognitive priorities shift, these dynamic parameters naturally evolve, causing to increase and/or to decrease between the relevant particles, leading to weaker and lower . That is until a needle prick provides a strong sensory signal that sharply decreases through automatic physiological responses, causing a spike in which increases . Subsequent inference processes trigger actions or thought patterns that dynamically influence the particle configuration to increase or decrease relative to the pain particle, reducing and thus .

Particle Wave-Field Properties

Building on the previously defined state distribution function for nodes, we can examine how particles exist as probability distributions rather than discrete points. For clarity:

  • For physical particles :

  • For semantic particles :

Where:

  • represents complex numbers, enabling interference patterns between particles
  • represents non-negative real numbers (time domain)
  • gives the probability density of finding particle at position at time
  • The phase component represents the particle’s affinity potential which is related to :

These wavefunctions form coherent structures through:

  1. Localization: Sharp peaks in probability density represent discrete beliefs or physical states, such as an opinion on who to vote for or a sleeping position.

Where is approximately a delta function centered at

  2. Delocalization: Probability density spreads across related concepts/states, describing how when one smells something it can trigger a memory or how thinking about cookies can bring to mind more general categories of baked goods.

Where:

  • are related semantic/physical states
  • are complex coefficients representing:
    • : Probability of activating state when the delocalized structure interacts strongly (e.g., when its component particles x exhibit high )
    • : Phase alignment with other states, related to the affinity function:
    • is directly influenced by bond strength:

Delocalization creates distributed semantic structures like “vehicle” encompassing multiple related concepts (car, bicycle, boat) with varying activation strengths. When interactions lead to increased bond strengths involving this structure (reflected in high for its components), component concepts are activated proportionally to .
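To make the “vehicle” example concrete, here is a small sketch that assumes activation probabilities are the squared magnitudes of the complex coefficients, normalised to sum to one. The coefficient values and the name `activation_probabilities` are hypothetical.

```python
def activation_probabilities(coefficients):
    """Map complex coefficients to activation probabilities.
    Assumption: probability is the squared magnitude, normalised to 1."""
    weights = {name: abs(c) ** 2 for name, c in coefficients.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}

# Hypothetical delocalized "vehicle" structure; phases differ per component.
vehicle = {"car": 0.8 + 0.0j, "bicycle": 0.0 + 0.5j, "boat": 0.3 - 0.1j}
probs = activation_probabilities(vehicle)
```

Note that the phase of each coefficient does not affect its own activation probability here; it only matters once components are combined with other structures.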

  3. Coherent Structures: Stable arrangements of multiple particles, such as how believing in a Christian God forms a stable structure with belief in the Bible’s teachings due to reciprocal constructive interference (resonance).

Where represents how individual particle wavefunctions combine

The complex-valued representation allows for:

  • Interference — When multiple beliefs/concepts constructively or destructively interact
  • Resonance — When particles with matching phases form sustained constructive interference.
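The difference between interference and simply adding probabilities can be sketched with two unit-amplitude components. This assumes the combined density is the squared magnitude of the summed amplitudes; the function name `combined_density` is hypothetical.

```python
import cmath

def combined_density(amplitudes):
    """Density of a superposition: squared magnitude of the summed amplitudes."""
    return abs(sum(amplitudes)) ** 2

a = cmath.rect(1.0, 0.0)                   # amplitude 1, phase 0
b_in_phase = cmath.rect(1.0, 0.0)          # same phase as a
b_out_of_phase = cmath.rect(1.0, cmath.pi) # opposite phase to a

constructive = combined_density([a, b_in_phase])     # phases match
destructive = combined_density([a, b_out_of_phase])  # phases oppose
independent = abs(a) ** 2 + abs(b_in_phase) ** 2     # no interference term
```

Matching phases give more density than the independent sum, and opposing phases cancel almost completely, which is the behaviour a real-valued distribution cannot express.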

Stable Bonds

Particles form bonds of varying strengths defined by their distance and affinity, creating causally linked structures analogous to protein folding:

Where:

  • is bond strength with units of energy, and represents the work required to separate nodes integrated out to . The units depend on the subspace like physical energy (Joules) in , or computational cost (operations/time) in
  • is distance in appropriate space (e.g. Euclidean in , embedding distance in )
  • represents the intrinsic tendency for nodes to link together (water and wetness would have high ). This is modulated by state/orientation (phase ) into an effective affinity that determines the interaction strength. (eg. needle flat vs point). Measurable via joint computational cost or inferred from evaluation of network structure.
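As an illustration only, the sketch below assumes a bond strength of the form affinity divided by distance. This is what one would get as the work to separate two nodes to infinity if the attraction fell off as affinity over distance squared; that functional form is my assumption for the example and is not taken from the definitions above.

```python
def bond_strength(affinity, distance):
    """Assumed illustrative form: B = A / d.
    (Work to separate to infinity under an A / r**2 attraction.)"""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return affinity / distance

# Hypothetical values: the same affinity at two different distances.
touching = bond_strength(affinity=2.0, distance=0.5)
distant = bond_strength(affinity=2.0, distance=4.0)
```

Whatever the exact form, the qualitative behaviour used elsewhere in this note is preserved: reducing distance or raising affinity strengthens the bond.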

Scale-Dependent Phase Coherence

Phase coherence between particles decays with distance , and is dependent on the scale of the entity:

Where:

  • is the phase coherence factor between particles and
  • is a scale-dependent attenuation coefficient: . Although may also depend on factors beyond scale such as communication channels or environment.
  • is the distance between particles

The effective phase relationship between particles becomes:

This ensures phase coherence is maintained within entity boundaries but decays across boundaries according to scale. To measure this factor , we could attempt to find statistical correlations between belief activations at various scales (e.g. belief alignment within families vs. nations).
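A sketch of distance-dependent coherence, assuming an exponential decay governed by the attenuation coefficient. The exponential form is my assumption (suggested by the term “attenuation coefficient”), and the numeric values are hypothetical.

```python
import math

def phase_coherence(distance, attenuation):
    """Assumed form: coherence = exp(-attenuation * distance).
    Equals 1 at zero distance and decays towards 0 as distance grows."""
    return math.exp(-attenuation * distance)

# Hypothetical attenuation coefficient for one entity scale.
near = phase_coherence(distance=1.0, attenuation=0.5)
far = phase_coherence(distance=10.0, attenuation=0.5)
```

Fitting the attenuation coefficient separately at each scale (for example from belief-correlation data within families and within nations) would be one way to test whether it really varies with scale.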

Boundary Formation

Boundaries are manifested in two complementary ways:

  1. Probability Density Gradients: Sharp drops in forming “edges” in physical or semantic space

  2. Phase Discontinuities: Regions where phase coherence breaks down between particles

Charisma and Entity Relationships

Charisma () is defined here as the ability of one entity () to influence another () by modulating the distances () and/or affinities () between particles within ‘s network. The goal of charisma is to change attention for a locus .

This manipulation of and alters bond strengths () and consequently changes the target’s attention profile (the set of nodal attention values ). While the mechanism involves and , the effect is often measured or observed as a change in this attention profile:

With three forms:

  1. Positive Charisma (): Influences particle distances and affinities to increase bond strengths toward some coordinate/particle, effectively saying “pay attention to this.”

Where represents the resulting gradient of change in the attention profile for particles near location , caused by charisma’s underlying influence on and .

  2. Negative Charisma (): Influences particle distances and affinities to decrease bond strengths away from some coordinate/particle, effectively saying “ignore this.”

  3. Null Charisma (): Minimizes changes to particle distances and affinities, resulting in minimal change to the target’s attention profile.
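Since the effect of charisma is observed as a change in attention, the three forms can be told apart by the sign of that change at the locus of interest. This is a minimal sketch; the function name `charisma_form` and the tolerance used to decide what counts as “minimal change” are hypothetical.

```python
def charisma_form(attention_before, attention_after, tolerance=1e-6):
    """Label the observed influence by the sign of the attention change.
    `tolerance` (an assumed parameter) sets what counts as no change."""
    delta = attention_after - attention_before
    if delta > tolerance:
        return "positive"   # "pay attention to this"
    if delta < -tolerance:
        return "negative"   # "ignore this"
    return "null"           # minimal change to the attention profile

# Hypothetical attention values at one locus before and after an interaction.
forms = [
    charisma_form(0.2, 0.9),
    charisma_form(0.9, 0.2),
    charisma_form(0.5, 0.5),
]
```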

Applications of Charisma

During a prompted interaction, one entity (the influencer) provides input to another entity (the target). The charisma mechanism works by crafting to induce specific changes in the distance () and affinity () parameters within ‘s particle network.

These / changes alter bond strengths throughout ‘s network, which in turn reshapes the attention profile . This reconfiguration of bond strengths and attention determines the output produced by ‘s inference process .

When aims to elicit a specific target output from , it must solve the charisma inference problem: identifying which input will induce the necessary / changes to maximize . We can express this as:

Where represents the charisma inference that predicts how ‘s personality and inference method will respond to various inputs. This process typically requires iterative testing, which is difficult in systems with memory as each interaction may further alter ‘s internal / parameters.

In practice, the goal isn’t always to produce an exact output , but rather to ensure can extract some target information from ‘s output:

A simplified case is an LLM without memory and with deterministic responses (). Here, one can map the “output landscape” by systematically varying inputs and observing how changes in affect the resulting / parameters (as reflected in the output), eventually constructing an approximation of . This is otherwise known as prompt engineering/optimization.
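For the memoryless, deterministic case, the charisma inference problem reduces to a search over candidate inputs. The sketch below uses a toy stand-in for the model and string similarity as the score; `toy_model`, `similarity`, `best_input`, and the candidate prompts are all hypothetical and are not meant to represent any real LM or API.

```python
from difflib import SequenceMatcher

def toy_model(prompt):
    """Hypothetical stand-in for a deterministic, memoryless LM."""
    return prompt.lower().replace("please ", "").strip() + "!"

def similarity(a, b):
    """Score in [0, 1]; 1.0 means the two strings are identical."""
    return SequenceMatcher(None, a, b).ratio()

def best_input(model, candidates, target_output):
    """Pick the candidate input whose output is closest to the target."""
    return max(candidates, key=lambda x: similarity(model(x), target_output))

candidates = ["Please say hello", "Say goodbye", "HELLO"]
chosen = best_input(toy_model, candidates, target_output="hello!")
```

Exhaustive search like this only works because each query leaves the target unchanged; with memory, every probe would shift the landscape being mapped.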

Conclusion

This framework provides tools for analyzing entities as systems of physical () and semantic () nodes linked by bonds () determined by distance () and affinity (). Key aspects include:

  1. Modeling entities across scales with scale-dependent properties ().
  2. Representing beliefs/concepts using wave-like probability distributions () allowing for uncertainty, interference, and phase-dependent interactions.
  3. Classifying information (, , , ) based on effects on transmission () and internal work ().
  4. Modeling influence (charisma ) as modulation of internal network parameters (, ) affecting attention ().
  5. Explicit integration with LLM formalism (Notation for LM Formalization) treating LLMs as entities operating within objective and subjective information spaces, with Personality () structuring their semantic subspace ().

While the framework provides expressive power, there is a need to operationalise and describe bond strengths (computational cost) and transmission probability (), to define benefit/harm scoring functions, and to validate the wave analogies empirically.

Work to empirically test and validate this framework should focus on:

  • Measuring phase coherence between beliefs within entities of various scales to test the scale-dependent coherence factor, .
  • Quantifying LM charisma based on the ability to induce desired internal states (tracked via the attention profile or other proxies) by manipulating inputs that affect internal and .

The major limitation remains the difficulty of appropriately defining metrics for semantic-physical interactions and spaces. In the case of LMs this is much simpler, as there are only input and output physical nodes that need be considered.

Connection to The Care and Feeding of Mythological Intelligences

This essay covers different forms of intelligence that have arisen in modern times.

  1. Angels (Deterministic Processes) exhibit highly localized particle distributions with rigid bond structures:

Where each represents a precise rule or computation. Angels operate primarily in semantic space with high phase coherence and predictable interaction patterns, making them efficient for well-defined tasks but brittle when encountering novel situations.

  2. Daemons (Statistical Processes) display partially delocalized distributions with probabilistic bond structures:

Where are distributions centered at optimization points . Daemons exhibit gradient-following behavior, with particle density flowing toward reward maxima. Their influence on networks operates by modulating and parameters to optimize bond strengths toward reward-maximizing configurations.

  3. Faes (Distributional Processes) manifest as broadly delocalized probability distributions:

Where represents semantic patterns. Faes operate through superposition of probability waves across semantic space, with particles that readily form and dissolve bonds based on pattern-completion dynamics. They influence networks by modulating and to reinforce pattern recognition, resulting in changes to attention profiles that highlight related semantic structures.

  4. Yokai (Complex Systems) emerge from interactions between the other types, with multi-scale boundary structures:

Yokai exhibit emergent properties through heterogeneous particle interactions across scale boundaries, creating entity structures with varying degrees of coherence and stability. They influence networks by modulating and across multiple scales simultaneously, creating complex patterns of bond strengths that manifest as hierarchical attention structures.

The meme-antimeme formalism directly relates to how these intelligences propagate information:

  • Angels transmit memes with high fidelity but limited adaptability

  • Daemons propagate memes that optimize specific objectives

  • Faes generate memes that pattern-match to existing semantic structures

  • Yokai create complex meme ecosystems with emergent properties

Similarly, the charisma functions (, , ) map to how each intelligence influences networks:

  • Angels influence particle networks through precise / modifications based on explicit instruction

  • Daemons modulate / parameters to optimize for specific objectives

  • Faes influence / through pattern-based resonance

  • Yokai modulate / across multiple scales simultaneously, resulting in complex attention profile changes

Attention’s relationship to beliefs

This relates to the activation function from Evolution of Alignment and Values, where the activation patterns represent the graph of connected beliefs:

This activation probability is influenced by the specific bond strength and contributes to the overall attention measure of the belief system.

Where is a belief (particle subgraph) and is a query (stimulus), with being the method of “inference” over a particle graph that produces a detectable alignment (response), . The goal is to be able to probe the memberships of beliefs in a personality, ^e84635, that completes inference according to some architecture (all my human context in an LLM would not recreate my next thought/idea).

This activation probability is the likelihood that the belief subgraph significantly influences the model’s output in response to query . This activation depends on the bond strengths between the query stimulus and the constituent particles within the subgraph . High activation typically correlates with, and contributes to, elevated attention measures for the particles comprising the belief subgraph . This is modelled in Detecting information in personality spaces.

Information Classification

Formalizing The Anatomy of Information, the fourfold classification of information is:

  1. Meme (): Information that increases transmission probability between specific entities, where is transmission probability, is baseline transmission probability.

    This can be grounded as the channel capacity and mutual information between entities. A key challenge is operationalizing rigorously, especially for LLM communication involving the interaction of inference functions, personalities, and interpretations.

  2. Antimeme (): Information that decreases transmission probability between specific entities. This can be grounded in the concept of negative transfer entropy.

  3. Infoblessing (): Information that reduces the work required for an entity to reach beneficial configurations or increases the work required to reach harmful ones.

    Where represents the work required for entity to transition to causal configuration of particles . Work encompasses metabolic energy, computational cost, and socio-psychological cost/benefit. Importantly, defining “beneficial” and “harmful” configurations requires entity-specific scoring functions. This can be grounded as the Kullback-Leibler divergence for beneficial configurations or as increasing path complexity towards harmful configurations.

  4. Infohazard (): Information that increases the work required to reach beneficial configurations or decreases work to reach harmful ones. This can be grounded as increasing the path complexity towards beneficial configurations, while decreasing KL divergence for harmful configurations.

Note that these classifications are often graded rather than binary and are highly context and entity-pair dependent.
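The fourfold classification can be sketched as a function of two measurements for a given entity pair: the change in transmission probability relative to baseline, and the change in work needed to reach beneficial and harmful configurations. The function name, parameter names, and example numbers are hypothetical, and the hard sign tests are a simplification of what the note above describes as graded categories.

```python
def classify_information(p, p_baseline, d_work_beneficial, d_work_harmful):
    """Return the set of labels that apply for one entity pair.
    p, p_baseline: transmission probability with and without the information.
    d_work_*: change in work to reach beneficial / harmful configurations
    (negative means the configuration became easier to reach)."""
    labels = set()
    if p > p_baseline:
        labels.add("meme")
    elif p < p_baseline:
        labels.add("antimeme")
    if d_work_beneficial < 0 or d_work_harmful > 0:
        labels.add("infoblessing")
    if d_work_beneficial > 0 or d_work_harmful < 0:
        labels.add("infohazard")
    return labels

# Hypothetical numbers for two familiar examples.
cat_video = classify_information(0.7, 0.2, 0.0, 0.0)
mug_cake = classify_information(0.6, 0.2, -1.0, -1.0)  # easy but unhealthy
```

A single call covers one entity pair, so information that is a meme for some pairs and an antimeme for others would need the function applied per pair and the labels compared.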

Information Classification Matrix

| | Meme | Antimeme | Neither meme nor antimeme | Both meme and antimeme |
| --- | --- | --- | --- | --- |
| Infoblessing | Viral life hacks | Therapy about embarrassing topics, how to handle a shameful event | Personal epiphanies, individual insights that improve one’s life | Complex moral frameworks |
| Infohazard | Chain letters, dangerous viral challenges, harmful rumors | Your parents’ weird sex tape, traumatic knowledge that is dangerous to share | Childhood trauma (generic) | Roko’s Basilisk |
| Neither infoblessing nor infohazard | Funny cat videos, “E” | Private insignificant secrets, forgotten trivia | Ordinary mundane information | Academic jargon on a niche subject |
| Both infoblessing and infohazard | “mug cake” recipes (easy but unhealthy) | Personal growth through shameful experiences | Childhood trauma (makes you funny) | The game of mao, where drug dealers hang out |

Basilisks and Information Extraction

This system can describe Newcomb’s Basilisk Defined, in a formal form. Basilisks represent a special case of information structures that extract work from entities through prediction-based incentives.

Memes () connect to basilisks through the affinity function which measures entity ‘s alignment with basilisk . A meme increases , making entities more likely to perform work extracted by the basilisk: .

Antimemes () can function as “anti-basilisks” that immunize against prediction manipulation by reducing confidence in the estimator’s accuracy: where is the predictor accuracy and is the reward ratio, as referenced in ^cf0da3.

In the particle-bond model, basilisks operate by creating specific configurations of particles that:

  1. Increase the probability of transmission between entities (meme property)
  2. Alter the work required to reach certain configurations (infohazard/infoblessing property)
  3. Modulate the distance () and affinity () parameters through targeted charisma ()

This connects to considerations about building alternative basilisks, as referenced in ^f401b1, where the strategic goal becomes maximizing the likelihood that any hostile entity, should it exist, will believe you were working within its incentive structure.