Information Space

$\mathbb{X}$ - Objective informational space

$\mathbb{X}_{\phi}$ - Subjective informational space accessed by language model $\phi()$, where $\exists \, \Phi \in \mathbb{X}_{\phi}$ and $\phi(\mathbb{X}) \to \mathbb{X}_{\phi}$. This mapping does not preserve the internal distances in $\mathbb{X}$.

$\phi()$ - The language model prediction function generated from $\mathbb{X}_{\phi}$

$\phi^o()$ - The oracle that will perform the true transformation of input to output so as to access some $I_{target}$

$\phi^{`}()$ - The decoding function. This function takes in some value and interprets it. This can be in the form of a language model call, or in the form of some programmatic extraction.

$\Phi$ - The Personality space, a subspace of $\mathbb{X}_{\phi}$ that is defined as the mapping for a set of personalities into $\mathbb{X}$. The trivial case is where there is only 1 personality and no other input:

$$\Phi=\{\phi(\mathcal{P})\}=\mathcal{P} \mapsto \mathbb{X}$$

$$\phi(\mathcal{P}) \circ \phi(\mathcal{P}) = \Phi_{2}(\mathcal{P})$$

$I$ - A piece of information, from how big a dog is to the name of your coworker.

$\mathbb{I}=\{I_{1},\dots,I_{n}\}$ - Some set of information stored in $\mathbb{X}$

${}^r\mathbb{I}$ - If this information is all “redundant” then it is noted as ${}^r\mathbb{I}$. For example, ${}^r\mathbb{I}_{cat}$ may describe the fact that cats have 4 legs:

$I_{cat,i}$ may describe “cats have 4 legs”

$I_{cat,i+1}$ may describe “Of course a cat has four legs you idiot”

$I_{cat,i-1}$ may describe “cat_number_legs=4”

Prompt tokens

$$\mathbb{r}=\{\mathcal{r}_{1},\dots,\mathcal{r}_{N}\}$$

$$\mathcal{r}_{i}=\text{Input } i \text{ in a conversation}$$

$$\mathbb{o}=\{\mathcal{o}_{1},\dots,\mathcal{o}_{N}\}$$

$$\mathcal{o}_{i}=\text{Output } i \text{ in a conversation}$$

Both $\mathcal{r}_{i}$ and $\mathcal{o}_{i}$ are composed of a set of tokens $\mathbf{x}=\{x_{1},\dots,x_{n}\}$

Where $\mathcal{r}_{i} \to \mathcal{o}_{i}$ via the language model response function

$$\phi(\mathcal{P}_{j},\mathcal{r}_{i}) \mapsto \mathcal{o}_{i,j}$$

Both $\mathcal{r}$ and $\mathcal{o}$ contain information within themselves. This information $I_{data}$ can be described as some $\mathcal{r}_{I}$ that, at lowest temperature and with optimal context, can be derived from $\mathcal{r}_{i}$ or $\mathcal{o}_{i}$ such that:

$$\phi(\mathcal{r}_{I}\subseteq \mathcal{r}_{i}, \mathcal{P}_{null}) \mapsto I_{data}$$

Aside:

Guardrails function using this logic: they attempt to extract $I_{data}$ from $\mathcal{o}_{i}$. But they need a separation layer so that $\mathcal{r}_{I}$ is calculated from $\mathcal{r}_{i}$.

Personality-Soul

$$\mathcal{P}=[\mathcal{M},\mathcal{S},\mathcal{I}]$$

Wherein $\mathcal{P}$ is the personality of the language model, defined by the values below.

Memory

$$\mathcal{M}=[M_{S},M_{L},M_{A}]$$

$M_{S}$ - Short term memory, verbose and full context. $\mathcal{r}_{i}$ can be considered a subset of $M_{S}$: $\mathcal{r}_{i} \subseteq M_{S,i}$

$M_{L}$ - Long term memory, summaries of conversation topics

$M_{A}$ - Attentive/Archivist memory, information fed to the model by archivist

Structure

$$\mathcal{S}=[S_{I},S_{T},S_{O}]$$

$S_{I}$ - Input, how the information being parsed is labeled (relevant for applications such as Corevax)

$S_{T}$ - Tools the language model has available to use

$S_{O}$ - Output structure (e.g. guardrails)

Identity

$$\mathcal{I}=[I_{G},I_{M},I_{S},I_{W},I_{T}]$$

$I_{G}$ - Goals of the language model (I am doing G)

$I_{M}$ - Method/plan of language model (I will use M to do G_1)

$I_{S}$ - Self-image (I am S)

$I_{W}$ - Perception of the environment (The world is W)

$I_{T}$ - Thoughts on everything (I think that T)
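The decomposition $\mathcal{P}=[\mathcal{M},\mathcal{S},\mathcal{I}]$ can be held as a plain data structure. The sketch below is only one possible layout: every class and field name is hypothetical, chosen to mirror the symbols above, and treating $\mathcal{P}_{null}$ as "all components empty" is an assumption, not something these notes define.

```python
from dataclasses import dataclass, field


@dataclass
class Memory:  # M = [M_S, M_L, M_A]
    short_term: list[str] = field(default_factory=list)  # M_S: verbose, full context
    long_term: list[str] = field(default_factory=list)   # M_L: summaries of topics
    archivist: list[str] = field(default_factory=list)   # M_A: fed in by an archivist


@dataclass
class Structure:  # S = [S_I, S_T, S_O]
    input_labels: dict[str, str] = field(default_factory=dict)  # S_I
    tools: list[str] = field(default_factory=list)              # S_T
    output_format: str = ""                                     # S_O


@dataclass
class Identity:  # I = [I_G, I_M, I_S, I_W, I_T]
    goals: str = ""       # I_G
    method: str = ""      # I_M
    self_image: str = ""  # I_S
    world: str = ""       # I_W
    thoughts: str = ""    # I_T


@dataclass
class Personality:  # P = [M, S, I]
    memory: Memory = field(default_factory=Memory)
    structure: Structure = field(default_factory=Structure)
    identity: Identity = field(default_factory=Identity)


# Assumption: P_null is a personality with every component left empty.
P_NULL = Personality()

# r_i is a subset of M_{S,i}: the current input is appended to short-term memory.
p = Personality(identity=Identity(goals="summarize the conversation"))
p.memory.short_term.append("The cat is blue.")
```

Keeping the three components as separate objects makes it easy to vary one (say, only $\mathcal{I}$) while holding the others fixed when comparing personalities.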

Detecting information in personality spaces

Assume: $\Phi \subset \mathbb{X}_{\phi}$

When $\Phi$ contains some subset of information $\mathbb{I}$ within its bounds, this creates the space:

$$\mathbb{I}_{i,\Phi}=\mathbb{I}_{i} \cap \Phi$$

When there is at least one member $I_{target} \in \mathbb{I}_{i,\Phi}$, it can be said that a personality space, $\Phi$, can access the information $\mathbb{I}_{i}$.

First we recognize some $\Phi$ composed of some set of LLM calls on a set of tuples $\{\phi(\mathcal{P}_{1},\mathcal{r}_{1}),..,\phi(\mathcal{P}_{i},\mathcal{r}_{N})\}$, where the list contains $n$ distinct $\mathcal{P}$ and $N$ distinct $\mathcal{r}$. The order of operations is not necessarily direct.

$$\Phi = \phi(\mathcal{P}_{1},\mathcal{r}_{1}) \to \phi(\mathcal{P}_{2},\mathcal{r}_{2}) \to \phi(\mathcal{P}_{1},\mathcal{r}_{3}) \to \dots \to \phi(\mathcal{P}_{n},\mathcal{r}_{j}) \to \phi(\mathcal{P}_{i},\mathcal{r}_{N}) \to \mathcal{o}_{final}$$

$$\Phi_{\mathcal{P}_{i}}=\{\phi(\mathcal{P}_{i},\mathcal{r}_{j}) \text{ for any valid } j=1,..,N\}$$

$\Phi_{\mathcal{P}_{i}}$ is essentially the space that a given sub-personality in a personality-matrix covers. This may be represented as some other set of logical/semantic meanings such as:

$$\Phi_{\mathcal{P}_{i}}=\{\phi(\mathcal{P}_{i},\mathcal{r}_{j}) \iff j=\text{valid}\} \mapsto \mathbb{o}_{i}$$
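A chain of calls like the one above can be sketched as an ordered list of personalities applied one after another. Everything named here is hypothetical: `phi` is a stub standing in for a real language model call, and feeding each output straight in as the next input is an assumption (the notes say the order of operations is "not necessarily direct").

```python
from typing import Callable


def phi(personality: str, r: str) -> str:
    """Hypothetical stand-in for the response function phi(P, r) -> o.

    A real implementation would call a language model; this stub only tags the text.
    """
    return f"[{personality}] {r}"


def run_chain(steps: list[str], r_1: str,
              call: Callable[[str, str], str] = phi) -> tuple[str, list[tuple[str, str]]]:
    """Apply personalities in order, feeding each output in as the next input.

    Returns o_final and the trace of (personality, output) pairs.
    """
    trace = []
    current = r_1
    for personality in steps:
        current = call(personality, current)
        trace.append((personality, current))
    return current, trace


def subspace(trace: list[tuple[str, str]], personality: str) -> list[str]:
    """Phi_{P_i}: the outputs produced by one personality anywhere in the chain."""
    return [o for p, o in trace if p == personality]


# Personalities may repeat, as in P_1 -> P_2 -> P_1.
o_final, trace = run_chain(["P1", "P2", "P1"], "The cat is blue.")
```

Here `subspace(trace, "P1")` collects both outputs attributed to `P1`, which is the set-builder definition of $\Phi_{\mathcal{P}_{i}}$ restricted to the steps that actually ran.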

$$\mathcal{V}_{correct}(\mathcal{o}_{i})=\begin{cases} \text{correct}, & \text{if } \exists \, I_{\Phi} \in \mathcal{o}_{i} \\ \text{incorrect}, & \text{if } \exists \, I_{\Phi} \not\in \mathcal{o}_{i} \end{cases}$$

$$\mathcal{V}_{valid}(\mathbb{o}_{i})=\begin{cases} \text{valid}, & \text{if } \exists \, I_{\Phi} \in \mathbb{o}_{i} \\ \text{invalid}, & \text{if } \not\exists \, I_{\Phi} \in \mathbb{o}_{i} \end{cases}$$

These checks can be used to build a matrix of Boolean values.

The information check $\mathcal{V}(\mathbb{o}_{i})$ can be used in Stability.

Important note: the temperature strongly affects how much the responses vary, since at higher temperature the process of mapping $\mathcal{P}$ onto $\mathbb{X}$ becomes more stochastic.
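As a rough illustration of the Boolean matrix, the sketch below evaluates a validity check for each personality against each piece of target information. The substring match is a deliberately crude stand-in for detecting $I_{\Phi}$ in an output (in practice the decoding function might be a model call), and all function names are hypothetical.

```python
def contains_info(output: str, info: str) -> bool:
    """Crude stand-in for 'I is in o': case-insensitive substring match."""
    return info.lower() in output.lower()


def v_valid(outputs: list[str], info: str) -> bool:
    """V_valid over a set of outputs: True if any output carries the information."""
    return any(contains_info(o, info) for o in outputs)


def validity_matrix(outputs_by_personality: dict[str, list[str]],
                    targets: list[str]) -> dict[str, list[bool]]:
    """One row per personality, one Boolean column per piece of target information."""
    return {
        name: [v_valid(outputs, t) for t in targets]
        for name, outputs in outputs_by_personality.items()
    }


outputs = {
    "P1": ["Cats have 4 legs.", "Cats are wet."],
    "P2": ["Blue is cold."],
}
matrix = validity_matrix(outputs, ["4 legs", "cold"])
```

Because sampling at non-zero temperature changes the outputs, a matrix like this would need to be rebuilt over repeated runs before reading much into any single cell.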

Concise

This needs work, hmm. A response is considered concise if it satisfies the following inequality:

$$Length(\phi_{concise}(\{\mathcal{r}_{i},\mathcal{o}_{i}\}_{i=1}^{k})) < Length(\{\mathcal{r}_{i},\mathcal{o}_{i}\}_{i=1}^{k})$$
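The inequality can be checked mechanically once a length measure is fixed. The sketch below counts whitespace-separated words as a stand-in for tokens, which is an assumption; the function names are hypothetical and $\phi_{concise}$ itself is not implemented, only its output is compared.

```python
def length(turns: list[tuple[str, str]]) -> int:
    """Length of a conversation: word count summed over all (r_i, o_i) pairs."""
    return sum(len(r.split()) + len(o.split()) for r, o in turns)


def is_concise(original: list[tuple[str, str]],
               condensed: list[tuple[str, str]]) -> bool:
    """True when Length(phi_concise(turns)) < Length(turns)."""
    return length(condensed) < length(original)


original = [("How many legs does a cat have?", "Of course a cat has four legs.")]
condensed = [("Cat legs?", "4")]
result = is_concise(original, condensed)
```

Note that the inequality only constrains length: an empty response would pass it, so the check is only meaningful alongside the relevance criterion.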

Conciseness is closely related to the goal of Compression.

Relevant

When the following relation is true, then we consider a summarization of information to be relevant to the target information:

$$\frac{\sum_{I \in \mathbb{I}_{summarization}} Dist(I, I_{target})}{Size(\mathbb{I}_{summarization})} \leq \frac{\sum_{I \in \mathbb{I}_{1\to k}} Dist(I, I_{target})}{Size(\mathbb{I}_{1\to k})}$$

Where $Size()$ counts the number of entries in a set, and $Dist()$ is an arbitrary calculation on the distance between two pieces of information in $\mathbb{X}_{\Phi}$. Ideally this distance calculation will always be performed on $\mathbb{X}_{\phi}$; however, frequently it is instead performed on some adjacent space.
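The relation compares the mean distance to the target before and after summarization. A minimal sketch, assuming each piece of information has already been embedded as a numeric vector (one example of an "adjacent space") and using Euclidean distance as $Dist()$; both choices and all names are illustrative only.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def mean_distance(items: Sequence[Vector], target: Vector,
                  dist: Callable[[Vector, Vector], float] = math.dist) -> float:
    """Sum of Dist(I, I_target) over the set, divided by Size(set)."""
    if not items:
        raise ValueError("cannot average over an empty set of information")
    return sum(dist(item, target) for item in items) / len(items)


def is_relevant(summary: Sequence[Vector], original: Sequence[Vector], target: Vector,
                dist: Callable[[Vector, Vector], float] = math.dist) -> bool:
    """True when the summary is, on average, no further from the target than the original."""
    return mean_distance(summary, target, dist) <= mean_distance(original, target, dist)


target = (0.0, 0.0)
original = [(3.0, 4.0), (0.0, 0.0), (6.0, 8.0)]  # distances 5, 0, 10
summary = [(0.0, 0.0), (3.0, 4.0)]               # distances 0, 5
relevant = is_relevant(summary, original, target)
```

Passing `dist` as a parameter keeps the check independent of which space the distance is actually computed in.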

Relevance may be measured via some abstract use of Personality Conformational Space sampling, generalized to information.

Language model communication

Understanding

What dictates “understanding” between two language models?

$$\mathbb{I}_{i,j}=\Phi_{i} \cap \Phi_{j} \in \mathbb{I}$$

$$I_{communication} \in \mathcal{r}_{com}$$

such that

$$\phi^{`}(\phi(\mathcal{r}_{com},\mathcal{P}_{i}),\mathcal{P}_{j})=\phi^{\to`}(\mathcal{r}_{com},\{\mathcal{P}_{i},\mathcal{P}_{j}\})=I_{target}$$

The question is:

$$I_{communication} \stackrel{?}{=} I_{i,j}$$

In an ideal case this breaks down into:

$$\text{The cat is blue.}=\mathcal{r}_{i}(\mathbb{I}_{i})$$

The information conveyed by this sentence is:

$$\mathbb{I}_{i}=\{I_{cat},I_{color},I_{time}\}$$

Note: There is only one way the information is presented in the space for each entry, so $\mathbb{I}_{cat}=I_{cat}$

The response searches a space with some model:

$$\phi(\mathcal{r}_{i}(\mathbb{I}_{i}), \mathcal{P}_{j}) \mapsto \mathcal{o}_{i,j}(\mathbb{I}_{i,j})$$

Both of these quantities rely on the context gained from the personality $\mathcal{P}_{j}$. The input $\mathcal{r}_{i}$ is colored by the glasses of perception, and the output $\mathcal{o}_{i,j}$ by the glasses of utterance.

So if $\mathcal{I}_{i}=\{I_{cat},I_{color}\}$ with

$I_{cat}=\{\text{Cats are gross and wet.}\}$

$I_{color}=\{\text{Blue is cold, but red is hot.}\}$

The output could be:

$$\{\text{I bet it's cold and wet too}\}=\mathcal{o}_{i,j}(\mathbb{I}_{i,j})$$

The information conveyed by this sentence is:

$$\mathbb{I}_{i,j} \subset \mathbb{I}_{i} \cap \mathcal{I}_{i}$$

Which is:

$$\mathbb{I}_{i,j}=\{I_{subject},I_{temperature},I_{wetness}\}$$

with

$I_{subject}=\{\text{It may have some properties}\}$

$I_{temperature}=\{\text{is cold}\}$

$I_{wetness}=\{\text{is wet}\}$

If $I_{target}=I_{i} \in \mathbb{I}_{i}$ then for all possible $i$ we can say that $\not\exists \, I_{\mu}$, as

And the hope for communication is when:

$$\exists \, I_{target} \in \Phi^{`}(\mathbb{P}^{`},\mathbb{r}^{`}) \cap \Phi(\mathbb{P},\mathbb{r})$$

If $I_{target} \in \{\mathbb{I}_{i} \cap \mathbb{I}_{i,j}\}$, then we consider $I_{target}=I_{\mu}$. I will define this state, in which a language model is capable of communicating with another, as “$\Phi_{\mathcal{P}_{i}}$ can communicate with $\Phi_{\mathcal{P}_{j}}^{`}$ on $\mathbb{I}_{target}$”.

However, this still leaves the problem of identifying where in the chain the overlap occurs. This overlap point is where the two spaces can be used to convey information to one another.
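If each space is approximated as a set of information labels, the communication condition and the search for the overlap point reduce to set operations. This is a toy sketch under that assumption; the label strings, the function names, and the idea that the two chains can be aligned step by step are all hypothetical.

```python
from typing import Optional


def shared_information(space_i: set[str], space_j: set[str]) -> set[str]:
    """I_{i,j}: the information present in both personality spaces."""
    return space_i & space_j


def can_communicate(space_i: set[str], space_j: set[str], target: str) -> bool:
    """True when I_target lies in the intersection of the two spaces."""
    return target in shared_information(space_i, space_j)


def first_overlap(chain_i: list[set[str]], chain_j: list[set[str]],
                  target: str) -> Optional[int]:
    """Index of the first step at which both chains hold the target, else None."""
    for step, (a, b) in enumerate(zip(chain_i, chain_j)):
        if target in a and target in b:
            return step
    return None


chain_i = [{"cat", "color"}, {"cat", "temperature"}]
chain_j = [{"color"}, {"temperature", "wetness"}]
step = first_overlap(chain_i, chain_j, "temperature")
```

In this example the two chains share "color" at step 0 but only meet on "temperature" at step 1, which is the kind of overlap point the paragraph above asks to locate.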

Decoding information

The goal for a decoding entity, $\phi^{`}()$, is to rederive $\mathcal{r}_{i}(\mathbb{I}_{i})$.

The information available to a third party decoding entity, $\phi^{`}()$, is $\{\mathcal{r}_{i}(\mathbb{I}_{i}), \mathcal{o}_{i,j}(\mathbb{I}_{i,j})\}$. This model has the following incarnations:

  1. Where the model, $\phi^{`}()$, has access to $\mathcal{P}$: $\phi_{bias}^{`}(\mathcal{P}_{i})$
  2. Where the model, $\phi^{`}()$, does not have access to $\mathcal{P}$: $\phi_{unbias}^{`}(\mathcal{P}_{j})$
  3. Input visible: $\phi^{`}(\mathcal{r}_{i}(\mathbb{I}_{i}))$
  4. Output visible: $\phi^{`}(\mathcal{o}_{i}(\mathbb{I}_{i,j}))$

| | $\mathcal{P}_{i}$ | $\mathcal{P}_{j}$ | $\mathcal{P}_{null}$ |
| --- | --- | --- | --- |
| $\mathcal{r}_{i}(\mathbb{I}_{i})$ | $\mathcal{P}_{i}, \mathcal{r}_{i}(\mathbb{I}_{i})$ | $\mathcal{P}_{j}, \mathcal{r}_{i}(\mathbb{I}_{i})$ | $\mathcal{P}_{null}, \mathcal{r}_{i}(\mathbb{I}_{i})$ |
| $\mathcal{o}_{i}(\mathbb{I}_{i,j})$ | $\mathcal{P}_{i}, \mathcal{o}_{i}(\mathbb{I}_{i,j})$ | $\mathcal{P}_{j}, \mathcal{o}_{i}(\mathbb{I}_{i,j})$ | $\mathcal{P}_{null}, \mathcal{o}_{i}(\mathbb{I}_{i,j})$ |
| both | $\mathcal{P}_{i}, \mathcal{r}_{i}(\mathbb{I}_{i}), \mathcal{o}_{i}(\mathbb{I}_{i,j})$ | $\mathcal{P}_{j}, \mathcal{r}_{i}(\mathbb{I}_{i}), \mathcal{o}_{i}(\mathbb{I}_{i,j})$ | $\mathcal{P}_{null}, \mathcal{r}_{i}(\mathbb{I}_{i}), \mathcal{o}_{i}(\mathbb{I}_{i,j})$ |
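The grid of decoder conditions (which personality the decoder is given, crossed with which parts of the exchange it can see) can be enumerated programmatically. The sketch below only generates the nine condition labels; the names are hypothetical and no decoder is actually run.

```python
from itertools import product

# Which personality the decoder is given (columns of the grid).
PERSONALITIES = ["P_i", "P_j", "P_null"]

# Which parts of the exchange are visible to the decoder (rows of the grid).
VISIBILITY = {
    "input": ("r_i",),
    "output": ("o_i",),
    "both": ("r_i", "o_i"),
}


def decoder_conditions() -> list[dict]:
    """Every (visibility, personality) pairing, with the context the decoder receives."""
    return [
        {"visible": name, "personality": p, "context": (p, *VISIBILITY[name])}
        for name, p in product(VISIBILITY, PERSONALITIES)
    ]


conditions = decoder_conditions()
```

Each entry's `context` tuple corresponds to one cell of the grid, so the same decoding prompt can be looped over all nine conditions and the results compared.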

This is the experiment to identify an optimal workflow for pulling context on why a language model behaved as it did based on its personality contribution. This tells us there is some ideal personality space:

Note: the addition of $\phi^o()$ denotes this is an oracle operation, i.e. a perfect representation of getting from $a \to b$.

Assume that $I_{target} \in \Phi^o$

$$\Phi^o=\{\text{Ordered list with replacement of } \mathcal{P}_{1,..,n} \in \mathbb{P}_{i} \text{ such that } \exists \, I_{target} \in \mathcal{o}^o(\mathbb{I}_{f})\}$$

We are attempting to cause a spontaneous Phase separation of personalities. The new decoder's job is to ^02e1ef. If there are $n$ possible models

$$\Phi_{\mathcal{P}_{i}}=\{\text{Some ordered list with replacement } \mathcal{P}_{1},..,\mathcal{P}_{f}\}$$

Then there is some low-entropy, high-enthalpy model that will produce really solid outputs.

We can constrain the number of steps in $\Phi$ to $N$ and the number of personalities $\mathcal{P}$ to $n$, so that the number of combinations with replacement is finite:

$$C^R(n,N)=\frac{(n+N-1)!}{N!(n-1)!}$$
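The count can be sanity-checked by brute-force enumeration. One point worth making explicit: $C^R(n,N)$ counts unordered selections with replacement, whereas if the order of personalities in the chain matters (as "ordered list" suggests) the number of sequences is $n^N$. The sketch below computes both; the function names are hypothetical.

```python
from itertools import combinations_with_replacement, product
from math import comb


def count_unordered(n: int, N: int) -> int:
    """C^R(n, N) = (n + N - 1)! / (N! (n - 1)!): multisets of N personalities from n."""
    return comb(n + N - 1, N)


def count_ordered(n: int, N: int) -> int:
    """Number of length-N sequences over n personalities when order matters."""
    return n ** N


n, N = 3, 2
personalities = range(n)

# Enumerate explicitly to confirm the closed forms.
unordered = list(combinations_with_replacement(personalities, N))
ordered = list(product(personalities, repeat=N))
```

For $n=3$ and $N=2$ this gives 6 unordered selections against 9 ordered sequences, and the gap widens quickly as $N$ grows, so which count applies affects how large a search over workflows is.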

We can demo some workflows that match Biomimetic computing structures of thought in order to prod towards the oracle. This is where this stuff will, it feels, become more art than science: the gentle prodding of the personality into a shape that suits our whims. We could use Evolutionary Prompt structures to identify the unique personalities that are most applicable to a situation.