LEV GOLDFARB: All right.
This title is not frivolous.
So I'm going to suggest to you that we don't have formalisms
for talking about classes, and we, therefore, do not
understand very well what classes are.
So these are sort of related things.
So the first third of the talk will be about the sort of
formalisms that we have and why they are inadequate for
dealing with classes and, therefore, classification.
And two thirds of the talk will be about this new
formalism that we have been working on for many, many years.
And this is just hot off the press even though this is the
But it's a much more satisfactory version.
So let's take a look-- we have to deal with very
basic issues here.
First, what is a numeric representation?
Well, this mapping did not really appear for a long time,
historically, but eventually we began to represent objects
Primitive tribes are still using knots and so on.
So the numbers actually should be thought of in this way, not
really as representations of things, but they can eventually
be used as representations.
But in any case, this is sort of a view of what is a numeric
Notice that no concept of class appears here.
So numbers, we have natural numbers, there
are no classes there.
It is no different, in fact, even if
you go to real numbers.
So this step is not a big step as far as the representational
formalism is concerned.
I want to allow questions during the--
because then we can proceed at a reasonable pace.
Discussions we can do at the end of the talk.
So first of all, a little bit about the classes.
Well, if you look carefully at objects in the universe it
turns out there's no single object that exists outside the
environment of its class.
So these are tightly linked things.
Whatever we look at there's always associated classes.
There is no object that exists outside the class.
They are co-existent concepts.
So it's important to understand, then, that there
must be some tight link between objects and classes.
And that's sort of the business of induction, is the
reason biological information processing is so effective it
sort of relies--
it's a built-in mechanism.
And therefore it must exist in nature too.
So it's not our invention.
It must exist.
Let me propose a definition of what is meant by class-oriented
We want to have a formalism that does the mapping of
objects, but that this mapping automatically induces the
mapping of the classes.
So the way we define classes in the representational
formalism should correspond to those classes.
So we should not invent, then, what the definition of class is.
It must come with the representational formalism.
So those are the two postulates.
And I'm going to discuss them in greater
detail in the next slides.
So what are the implications?
I'm saying nontrivial, but it's not a particularly difficult
one, and we can discuss it later: first of all, the
class must be induced by the object mapping.
So it's not something we should be sweating a lot
about, introducing classes.
And they should automatically capture
the classes in reality.
So here is our representational formalism.
Once I define a class here, it should capture exactly the
corresponding class that exists in reality.
And we will discuss later on why conventional
representations, certainly the vector space representation, do
not satisfy this property.
And also, what about the general
structure of this concept?
So we are not going yet into details but there's something
about general formal structure, what we want to
We would like to have this situation.
The class representation must be expressed with basic object
If we violate this we would violate the first basic axiom.
Now I'm just drawing your attention in this slide that
we don't have the concept of class representation.
And this is not really an accident, even though it's
quite an unreasonable situation to do classification without
the concept of class representation.
But I'm saying that it's not accidental.
It just so happens that conventional formalisms do not
really support an adequate concept of class.
We kind of have to add it post-hoc.
So you can think about that situation, about the extent
to which we have the concept of class representation.
Now, moving on to the sort of structure that we want to
have. We want it to be an
inductive generative structure.
So what does it mean?
Generative structure is similar to what Chomsky was
talking about with grammars.
We want to be able to generate every object in the class, and
only those objects.
And it must be inductive.
Now, inductive. Formal grammars don't possess that property.
Inductive means it must be effectively, reliably,
recoverable from a small training set.
That's what inductive is.
So I'm just reminding you that there was a popular syntactic
area in the '70s and '80s, inductive pattern recognition, where
they tried to adapt formal grammars to describing classes
But it sort of fizzled out because of the problems I will
discuss in a moment.
So what is it related to, the inadequacy of formal grammars?
Well Chomsky actually never believed, as I mentioned here,
he never believed in induction.
Because first of all, on the one hand he couldn't see a
connection between formal grammars and induction because
there's no tight link between a set of strings
and the formal grammar.
A formal grammar is sort of god-given.
Somehow it appears with all the rules and
productions, and so on.
And the difficulty, of course, has to do with even the
underlying string representation.
They just don't carry enough information to allow this
recovery of class representation.
Well here's just a simple example.
If you take two strings, because they don't have
formative history embedded in them as a part of their
representation, you can't really distinguish between
those two strings, even though, if you go to a slightly
more precise representation (this is not a tree, it
captures the formative history), you will see that they
are different.
And this is what is indicated with green and red: in the
first case you had aba and aca applied where the second a was
the context, while in the second case the first a was the
context, which is a very different situation.
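
The point about formative history can be made concrete with a small sketch. This is only an illustration under assumed rules, not the construction shown on the slide: the `Step` type, the "splice next to a context symbol" operation, and both histories are hypothetical. It shows two different histories, one using the second a as context and one using the first a, that end in the same plain string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One rewrite step: splice `inserted` next to a chosen context symbol."""
    context_index: int  # position of the symbol that serves as context
    inserted: str       # material added by this step
    side: str           # "left" or "right" of the context symbol


def apply_history(seed, steps):
    """Replay a formative history and return the resulting plain string."""
    s = seed
    for step in steps:
        cut = step.context_index + 1 if step.side == "right" else step.context_index
        s = s[:cut] + step.inserted + s[cut:]
    return s


# Hypothetical history 1: start from "aba"; the second a (index 2) is the context.
history_1 = ("aba", (Step(context_index=2, inserted="ca", side="right"),))
# Hypothetical history 2: start from "aca"; the first a (index 0) is the context.
history_2 = ("aca", (Step(context_index=0, inserted="ab", side="left"),))

string_1 = apply_history(*history_1)
string_2 = apply_history(*history_2)

print(string_1, string_2)      # the two plain strings
print(string_1 == string_2)    # the strings coincide
print(history_1 == history_2)  # the formative histories do not
```

Once only the final string is kept, the two cases can no longer be told apart, which is the loss of information being described.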
I'm going to discuss a little bit vector
space in several minutes.
Well here is a formal definition, just to remind you
what a vector space is, you know, these famous eight
axioms. But basically you have two operations.
Now what does that mean from an applied point of view, when
you define a vector space in this way?
This is the standard formal definition
of it as a structure.
So we must assume that representational structure is
the algebraic structure.
There's simply no other candidate for
So we must assume that it is these
operations that act on objects.
And accordingly, here I'm just discussing why this was
systematically overlooked.
But accordingly, what are we allowed, then, if we assume
that these are the operations on the data and that's what
mathematical description of that formal structure is?
We must demand that the class representation is expressible
via these operations.
The only candidates, then, become affine subspaces, which are
just shifted linear subspaces, because they have to be
consistent with the underlying structure.
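
As a rough sketch of what a class expressible via these operations looks like, the code below treats a class as the affine hull of a training set: elements are generated by affine combinations (weights summing to 1), and membership is decided by an exact rank test. The function names and the sample points are assumptions made for illustration; exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction


def affine_combination(points, weights):
    """Generate a class element as sum(w_i * p_i) with weights summing to 1."""
    if sum(weights) != 1:
        raise ValueError("affine weights must sum to 1")
    dim = len(points[0])
    return tuple(sum(Fraction(w) * p[k] for w, p in zip(weights, points))
                 for k in range(dim))


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over fractions."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def in_affine_hull(points, x):
    """True if x lies in the affine subspace spanned by the training points."""
    base = points[0]
    directions = [[a - b for a, b in zip(p, base)] for p in points[1:]]
    target = [a - b for a, b in zip(x, base)]
    if not directions:
        return all(t == 0 for t in target)
    return rank(directions) == rank(directions + [target])


# Hypothetical training set: three points on the plane z = 1.
training = [(0, 0, 1), (1, 0, 1), (0, 1, 1)]
generated = affine_combination(
    training, [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)])

print(generated)
print(in_affine_hull(training, (2, 3, 1)))  # on the plane z = 1
print(in_affine_hull(training, (0, 0, 2)))  # off the plane
```

Under this reading the class is fixed by the training set and the two vector space operations alone, which is why it is so restrictive as a model of real classes.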
Now from an applied point of view, this linear
generativity, obviously, is going to capture--
AUDIENCE: So why can't you represent a class as a
half-space, for example?
AUDIENCE: Could you repeat the question please?
AUDIENCE: I'm sorry.
Why can't you represent a class as a half-space if you
choose a vector space representing a half-space.
LEV GOLDFARB: They do not feed this standard [? mass. ?]
You want to have a class representation in which every element
in the class is generated.
In other words, the class description must be of
Every element must be expressible via basic
operations of class representation.
So if I give you a training set, and also here is
inductive, the second part comes in.
You want it to be recoverable uniquely from a finite
So you want to have both of these things, generativity and this.
How would you define, then--
If I give you a finite training set, and you say the
class would be a half-space, which
half-space would you choose?
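
The question can be illustrated with a small sketch. The training points, the candidate half-spaces, and the probe point are all made up for the example; the only claim is that several half-spaces can contain the same finite training set while disagreeing about a new point.

```python
def in_half_space(w, b, x):
    """True if x satisfies w . x + b >= 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b >= 0


# Hypothetical one-class training set in the plane.
training = [(1, 1), (2, 1), (1, 3), (3, 2)]

# Hypothetical candidate half-spaces, each given as (w, b).
candidates = [
    ((1, 0), 0),    # x >= 0
    ((0, 1), 0),    # y >= 0
    ((1, 1), -1),   # x + y >= 1
    ((1, 0), -1),   # x >= 1
]

# Keep every candidate that contains the whole training set.
consistent = [(w, b) for w, b in candidates
              if all(in_half_space(w, b, x) for x in training)]

# Ask each consistent candidate about a point outside the training set.
probe = (0, 5)
verdicts = [in_half_space(w, b, probe) for w, b in consistent]

print(len(consistent), "of", len(candidates), "candidates fit the training set")
print(verdicts)
```

All four candidates fit the data, yet they do not agree on the probe, so the training set alone does not single out one half-space.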
AUDIENCE: I'd [UNINTELLIGIBLE] margin half-space.
Let's say the half-space that separates the training data.
LEV GOLDFARB: No.
Suppose I give you just one class, this training set from
How would you choose which class representation would be
reasonable for, if I give you whatever, 20 vectors?
AUDIENCE: [UNINTELLIGIBLE] of one class.
LEV GOLDFARB: How would you define it, given that you want to
have those two properties: generativity, and everything
should be expressible via the basic operations, vector
addition and scalar multiplication?
How would you define class elements?
How would you give me exact--
you have to generate every class element.
AUDIENCE: Why do you feel that this ability is
necessary for a--
LEV GOLDFARB: Well, because if you think about it, here are the
real classes and here is your formalism, and we want the
object mapping to induce the class mapping.
So if I have certain objects in the class I want to be able
to generate every object of this class, and only the
objects in the class, using the class representation.
I want to have this one-to-one mapping between real objects
and objects in my formalism.
AUDIENCE: So one thing that's been proposed for a one-class
classification is to find the half-space whose coefficient
vector has the smallest [UNINTELLIGIBLE], or
something, that satisfies constraints of the--
enforce the fact that all members of your class are on
one side of the half-space, for example.
LEV GOLDFARB: But again, if you take, again--
let's go back to the real object in this mapping.
You remember this mapping from real
object into your formalism?
You want to ensure that, after you do this, all objects
that you can generate in your formalism, every object that
you can generate in your formalism, corresponds to a
physical object that you are dealing with, in the class
you're dealing with.
I mean this mapping is very important because you are
mapping the actual object into objects in your formalism.
So you want to have a very tight link between the actual
object and the object that you have in your formalism.
So once you say it's a class representation you have to
then claim that every object that falls within this class
has a corresponding object in reality, which
will not be the case.
You want to have a strong restriction.
It's a very strong restriction on what you want to have as a
representational formalism.
AUDIENCE: So I guess depending on what you call a class, which I
guess is what we were talking about over lunch the would be
LEV GOLDFARB: Well, no, you don't want to call anything
You want to define class in such a way that this mapping
Because representational formalism is something that
you're going to use to represent the actual objects,
whether it's a web page, it doesn't matter.
You're going to use that representational formalism to
So once you are in your representational formalism now
you have to live here.
But that means you cannot invent the concept of class
that is not meaningful over here.
So once you map the object you're done.
You are not allowed--
you don't have any other means--
to change what the class of actual objects is over here on
the left-hand side.
Because you're dealing with real objects.
You are just representing them in this formalism on the
So once you map them that's it, you have no control.
So the whole point is that once you do this mapping,
you want it to automatically induce the
mapping of this class onto this class.
It's a strong requirement, but what I am suggesting is that this
is sort of a necessary requirement.
You see, if you do not satisfy that condition you will be
defining classes here that have no reality over here, and
that's a problem.
AUDIENCE: It just seems to me that it's impossible to define
LEV GOLDFARB: You will see now.
That's what I am going to be talking about.
This is what most of the talk will be about.
AUDIENCE: But then the question might be, are there
things that we would call classes that are not captured
by this formalism as well?
If you define a class to be things that are captured by
LEV GOLDFARB: But you will see.
No you will see--
the formalism is universally applicable because in this
formalism all objects can be encoded, therefore, it's very
easy to check if there are real classes that--
once you have that mapping and you use this formalism, then,
if you can find classes here that are not classes in your
formalism then you're done.
Then you know it's not good.
But you will see it is not like that.
So anyway, what I am suggesting is that, to compensate
for the above paucity of, not classes, but classes
according to that definition, and in violation of sort of
the wisdom in mathematics,
one brings in alien class representations and, therefore,
So you bring things like nonlinear functions and other
things to define class.
They have nothing to do with the structure of this vector
space, which is algebraic.
This is a representational structure.
That's the whole point, that if you want to treat objects
this way, if you want to take your representational
formalism seriously, you must take the operations on objects
So this is a typical picture.
Here is your class, and you have some nonlinear decision
Now everybody got used to this.
It took me 10 years to ask this kind of question because
I also was used to these things, assuming that that's
OK, you know, this is how things look.
But the question is how meaningful and informative is
such a class description, if you call these classes?
What do we learn when I draw these curves
in the vector space?
How much did I learn about anything real, given that this
class is supposed to correspond to reality?
How much am I learning about reality when I am drawing
these kinds of curves or surfaces in a vector space?
What am I learning?
Well, it's useful to ask this question because you want to
be able to say that when you've gotten class
representation that you learn a lot of things about reality
because that's sort of the purpose of induction.
And this is where the hidden induction--
this is what philosophers thought.
That we should be able, when we say we learn something,
class, we should get wiser.
What are we learning about actual objects
when we draw this?
But anyway, let me move on.
So there is no tight link between the training set and
the class representation.
And that's a big problem because we want to have a much
tighter link so that things are uniquely deducible, the
So what are we learning during the
learning process, basically?
So the problem is not with learning algorithms, that's
what I am saying.
It is a much deeper underlying problem with the
representational formalism itself because the structure
of this representational formalism is such that you
cannot remedy this deficiency to begin with, once you're there.
So I'm saying the more recent trends in machine
learning, again, these distances, kernels, and so on,
they don't change this.
I'm just listing again, it doesn't change the basic
situation that we have been discussing.
You still don't have a meaningful class
representation, and that's a big problem.
So we've got to move-- as hard as it appears--
we've got to move on.
And the benefits are tremendous because once you
have that kind of formalism you are really doing--
it's not just important for information processing,
classifying web pages, but also, from the point of view of,
say, these protein data banks, you would have a good,
meaningful description of these classes,
which would then be very illuminating to biologists.
It won't be just something that one uses only for some
strange auxiliary purposes.
And of course as far as the search engines are concerned,
well, imagine you have that formalism, you represent
classes, and you use it as a basis for
organizing a search engine.
And then the query will be either a class element or a
description of a class; two things are possible.
So the user will have a certain interface and will either
specify a class element or specify a
description of a class.
And he will get, as a result of the query, the class itself.
Well, here I am discussing where the origins of this
[? word ?] go back to 1990.
It's a long paper published in Pattern Recognition.
This is the website for the latest version.
It's a 70-page paper but it's well illustrated.
And it's just 70 pages, I think we removed all the
Just the basic concept there, it takes 70 pages, but there
are a lot of illustrations.
There's a formal decision, here I'm just going to give
you informal, just intuitive feeling for what's going on.
So as a scientific view we would like to view the reality
as a multitude of interacting and evolving classes of
Let's say let's view reality as this.
In fact, that's what it is, any bottle, any desk.
This actually is a process, and you will see next.
If you think about a chromosome, because it has
genes, this is a poor representation.
A string is a poor-- it doesn't correspond because, in
reality, it's a dynamic process.
This is how it was created, and this is how it gets
translated, and so on.
So there is a process involved and string doesn't capture
this kind of reality.
AUDIENCE: But a string captures a more abstract
concept because different chromosomes could share the
same DNA sequence, right?
[INTERPOSING VOICES] a chromosome in different
LEV GOLDFARB: Yes, but you remember what I mentioned
about strings earlier.
That if I give you--
even never mind the sort of more pragmatic point of view,
but just doing learning--
if I give you a finite set of strings, there are
infinitely many classes of strings, according
to the current concept, that contain this finite set.
That's not a palatable situation.
You don't want to have that situation.
Again because a string does not carry within itself
sufficient information for doing this adequate recovery.
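
A small sketch can make this underdetermination concrete. The training strings and the candidate classes below are invented for the example; `balanced_up_to` builds one class for every bound, so the family can be extended without limit, and each member still contains the training set.

```python
import re

# Hypothetical finite training set of strings.
training = ["ab", "aabb", "aaabbb"]


def balanced(s):
    """Membership in { a^n b^n : n >= 1 }."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


def balanced_up_to(limit):
    """Membership in { a^n b^n : 1 <= n <= limit }, one class per limit."""
    return lambda s: balanced(s) and len(s) // 2 <= limit


candidates = {
    "a+b+": lambda s: re.fullmatch(r"a+b+", s) is not None,
    "[ab]*": lambda s: re.fullmatch(r"[ab]*", s) is not None,
    "a^n b^n": balanced,
}
# Any limit >= 3 gives yet another class containing the training set.
for limit in range(3, 8):
    candidates[f"a^n b^n, n <= {limit}"] = balanced_up_to(limit)

consistent = [name for name, accepts in candidates.items()
              if all(accepts(s) for s in training)]

# The consistent classes disagree on strings outside the training set.
probes = ["aab", "aaaabbbb"]
disagreement = {p: {name: candidates[name](p) for name in consistent}
                for p in probes}

print(len(consistent), "candidate classes contain the training set")
for probe, answers in disagreement.items():
    print(probe, answers)
```

Every candidate accepts all three training strings, and raising the range bound adds more, so choosing among them requires an extra principle that the strings themselves do not supply.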
AUDIENCE: But I think there are reasonable principles to
use to choose from among the infinitely many concepts that
are consistent with a finite set,
LEV GOLDFARB: They would be somewhat ad hoc principles
that would simply have to be put in in order to deal with
that big problem.
AUDIENCE: And there's a trade-off between how ad hoc
or principled a solution is and how useful it is.
LEV GOLDFARB: But the whole point, you see, of having a
good formalism for classification is that you don't want
to introduce these kinds of assumptions.
You want to get into the representation and you want to
move on with it and easily do the job
that you need to do.
You don't want to introduce later on all kinds of ad hoc
AUDIENCE: It depends on what you want to do, I suppose.
LEV GOLDFARB: No, I mean, in general that's the idea of
If you look back at these first two axioms, once you
represent the objects you want your
classes to emerge naturally.
You don't want to do any hocus pocus.
You want them to be sitting there.
AUDIENCE: But it depends, right?
If somebody has an engineering objective to build an
interface to a database where scientists can put their
sequence and it tells them something about the likely
function, let's say, of a tissue sample, or
something like that.
The representation depends on the DNA--
LEV GOLDFARB: I'm sorry, what are the classes here?
What would be the classes?
AUDIENCE: The class might be proteins with a given
function, I suppose.
LEV GOLDFARB: Proteins with a given function.
So that precisely fits into this concept of class.
AUDIENCE: But a representation that depends on the DNA coding
for the protein might be a more convenient and effective
way to design an engineering [INTERPOSING VOICES].
LEV GOLDFARB: Not necessarily, if it is known that there are
proteins that have very similar DNA structure or RNA
structure but they have different function, quite
AUDIENCE: But the existence of those cases doesn't
necessarily mean that it isn't an effective engineering tool.
LEV GOLDFARB: No, it's an indication that you cannot
effectively grasp the class of these.
The whole point is you want to have a reliable grasp of the
class that you are dealing with.
That is the purpose of classification.
You said yourself, you want to be able to say that this class
of proteins has very similar functions.
That's the purpose, I assume, you told me that would be the
definition of that class.
LEV GOLDFARB: Right.
So if you want this, then you automatically want--
if I give you a small training sample, 5, 10, 15 proteins,
you want to be able later on to reliably say whether other
proteins belong to that class or not.
And more importantly, you won't be able to answer that
question unless your class representation contains some
nontrivial information related to the function of this
protein in some form.
AUDIENCE: Yeah, like the DNA sequence.
LEV GOLDFARB: No, but DNA sequence
AUDIENCE: Yeah, but he's talking about-- you're talking
about folding a protein.
The same sequence can fold in different ways under the
influence of other things in the system, which really ties
into his argument where the temporal process is the
folding and there are a lot of other elements involved in the
protein folding beside just the DNA sequence.
AUDIENCE: [INTERPOSING VOICES] chromosomes.
LEV GOLDFARB: No.
But this is again--
AUDIENCE: [INTERPOSING VOICES]
The chromosomes are the source code that describes the DNA
sequence, but the sequence has to be executed in the rest of
the system that causes the folding to occur that
eventually produces the behavior of the protein.
The behavior of the protein is really just determined by how
it is folded and not just by the sequence itself.
AUDIENCE: But that fact doesn't necessarily imply that
one must design induction engines that are going to
reason about classes of proteins using representations
of proteins that involve all the details of the
evolutionary history of the protein and--
LEV GOLDFARB: No, no, no.
We are not talking about this yet.
AUDIENCE: It's not evolutionary history, it's
just an individual--
AUDIENCE: Process [INTERPOSING VOICES]
which protein involved.
LEV GOLDFARB: No, no.
We're not talking about that yet; I think this is what you
added yourself right now.
I haven't suggested that you have to plug in the entire
information about the protein.
I'm only telling you that your representational formalism
should be reliable enough that it will capture the class in
the right way with this function.
And the string does not have enough information.
AUDIENCE: It depends.
It depends on the class that you're trying to describe.
LEV GOLDFARB: Well you gave me an example of a class, right?
So I'm assuming we're discussing that class.
AUDIENCE: There are some classes that can be described
very well by their DNA sequence.
LEV GOLDFARB: Well I don't know of any classes, except
very trivial ones, that would be described by DNA sequence.
AUDIENCE: A chionesis sequence, I suppose.
So chionesis have a protein sequence that is highly conserved
in the core region of the protein that attaches to a
particular type of molecule.
LEV GOLDFARB: Right.
But for example, I can give you a sequence very close to
the ones that you think belong to that class that will not be
doing this at all.
AUDIENCE: But for the engineering purpose of
classifying real proteins the scientists really encounter--
LEV GOLDFARB: No, no.
When you say real, but this would be a real sequence because
I can manufacture you something that will have
exactly that sequence but will not behave in that way.
AUDIENCE: But the inability of my system to handle that
artificially created sequence--
LEV GOLDFARB: It's not artificially created, this is
a real thing.
I'm talking about creating a real--
AUDIENCE: I mean that those types of circumstances will
arise infrequently enough that my engineering tool will,
nevertheless, be useful.
LEV GOLDFARB: Why are you so sure about this?
AUDIENCE: A tool of that type is being used as we speak.
LEV GOLDFARB: No, no.
It is being used because we don't have other tools.
It is not being used because it is superior to the tools I
want to discuss because these tools have not been around
yet, the ones that I want to discuss.
AUDIENCE: I guess what I'm arguing for is a possibility
that abstract representations, which leave out some aspects
of the items that we wish to classify can, nevertheless, be
That's the only proposition--
LEV GOLDFARB: Yes.
But this is a general enough statement.
What I want to discuss today is
what is important to have in the representational formalism
that will give you a reliable way to deal with
classification, not what is possible to do.
Of course, if you are poor in representational means,
yes. But if you have more powerful
representational means, why not go for this, because you
will have a much more precise picture of everything?
AUDIENCE: But sometimes that structure can be powerful
because it can save you unnecessary effort on sorting
through the vast complexity--
LEV GOLDFARB: OK.
You will see, as we go down, you will see why it is true.
There is a certain hypothesis you will see coming up.
I am just beginning to discuss this temporal process that is
First, I just want to remind you there were even a few Nobel
Prizes awarded in the last 15 years in
evolutionary developmental biology.
If you look at the body plan, it turns out that all animals
have, going back roughly, what, 500 million years,
regulatory genes that organize the body plan during
development.
And you go from mammals, and you go to flies, and they're
Of course, there were some modifications.
So I'll come back to this.
I just want you to keep in mind these sort of regulatory
proteins that organize things.
So what is a body plan?
You can think about each segment as a class.
And this class is being modified evolutionarily.
But think of each body segment also as a class, and you build
larger classes out of the smaller body segments.
But we will come back to this one.
Well, what about other data?
Let's look at web pages.
Web pages should not be treated as static.
So what we are talking about is not whether you can
represent-- you can always represent something
in a very poor way.
We want to understand what a reliable
representation for the purposes of classification means.
What is a good representation for the purposes of
That's what we haven't discussed.
We are not discussing whether--
we can always go into poor
representations for various reasons.
But to do classification, that's what we are discussing,
which kind of representation.
So the web page, we are interacting with the web page,
so it is not a static object.
If you think about it, even the way we represent a web page
we are sort of interacting, and here is the time element,
so there is a process involved.
So I'm preparing you for the objects-as-processes view.
This is what will be done.
So if you're talking about classes of objects, we are
now talking about classes of processes.
So each object kind of will be-- you will see now, I
haven't defined this formally yet, it's just
something like that.
So what are the basic formal units here?
They are called primitives, more fully, primitive
This circle, now, this is an abstract primitive.
What it defines is this.
Each of these figures on the top corresponds to a class of
processes that are coming in.
Then they interact and something, other classes, are
being produced, sometimes similar.
So this is the unit of representation.
These are the atomic units of representation.
And what do they capture?
They capture a set of events, because everything is now a
process, so we use events to represent things.
Events understood as interaction of classes.
So this is a concrete primitive.
Unfortunately the figures are the same, but now you're
talking about a concrete process that belongs to the class, so
you can think of a concrete object.
So those [? c1, ?]
1 and cj2, and so on, these are the concrete processes in
our concrete elements of the class.
So this is a concrete primitive.
And this is a label, if you see that.
What is the label?
A label is a way, because we are not going to be writing those
processes on the top, so this label b captures the sequence
And well, when you observe your sequence of events, it
just so happens that the terminal
process, so-called, of one primitive becomes an initial
process for another primitive.
Now primitives are events, they're just encapsulating
events that you observe during the interaction, whatever.
So just some very light examples.
Car collision, OK?
You have an event, right?
Two processes come in and two processes come out.
Two cars were driving, this is the two
processes, two objects.
And this sentence.
Take a sentence, John fell in love with Susan, OK?
Well if I tell you that John fell in love with Susan, as
far as you're concerned, your representation is just
an event to you.
There are two processes; that is one way of modeling it.
Two initials and two terminals, because you change
your perception of those two people, so
something happened to them.
It's not the same John and it's not the same Susan,
something was added.
This event added something to this.
But you continue, John is still John, but it's a
This is what is shown.
So Alice and Bob had a baby.
Again, having a baby: two initials, three terminals
because you knew Alice and Bob, now, there are three of
them, and so you have three processes.
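
The three everyday examples can be written down as data in a minimal sketch. The `Primitive` type, its field names, and the process names are assumptions chosen for readability, not the formal ETS definition; the only thing taken from the talk is the count of initial and terminal processes in each event.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    """An event: some processes come in, interact, and some processes come out."""
    label: str
    initial: tuple   # processes entering the event
    terminal: tuple  # processes leaving the event


def arity(p):
    """Number of initial and terminal processes of a primitive."""
    return (len(p.initial), len(p.terminal))


# Hypothetical encodings of the three examples.
collision = Primitive("car collision",
                      ("car 1", "car 2"),
                      ("car 1 after", "car 2 after"))
fell_in_love = Primitive("fell in love",
                         ("John", "Susan"),
                         ("John after", "Susan after"))
had_a_baby = Primitive("had a baby",
                       ("Alice", "Bob"),
                       ("Alice", "Bob", "baby"))

for p in (collision, fell_in_love, had_a_baby):
    print(p.label, arity(p))
```

The terminal processes carry different names from the initial ones where the event changed them, reflecting the remark that it is not the same John and not the same Susan afterwards.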
Let's go to a more scientific example, so to
speak, in physics.
This was inspired by Feynman diagrams, but
they are less precise.
They are capturing this situation less--
you will see now.
You will see several examples related to this.
So you have primal processes, these are the processes that
come in and come out.
They're called primal processes.
You see, you don't know at the basic level, you don't know
yet the structure of the process.
So the structure is suppressed and you just see lines.
That's all you see.
Later on, you will see that this structure will become
available for transform once we learn something.
So these are the primal processes and these are the
events, the basic events.
Now let's take a look at the hydrogen atom.
Well, I guess, no.
Before this we still have to take a look at that view.
This is just an encapsulation of how to think, how to translate
everything into this formalism, ETS.
So here you observe state one, you observe something: a, b, c.
Then state two, you observe d and c, and what you saw is that a
and b merged to form d.
Here is an event.
So normally, conventionally, you would represent them as a
sequence of these states.
Here what you are representing is this.
That's a representation of the situation where time
is going this way.
So time becomes a very important factor.
Time and structural representation now become
First of all, you cannot go back.
Once you're here, that's it.
Time is going down, you cannot go back, so
everything is going one way.
These are irreversible processes.
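
The contrast between a sequence of states and a temporal record of events can be sketched as follows. The `Event` type, the `replay` helper, and the alternative history are hypothetical; the example from the talk is only that a, b, c is observed, then d and c, because a and b merged to form d.

```python
from dataclasses import dataclass

# Conventional view: two snapshots, with nothing said about how one became the other.
state_sequence = [frozenset("abc"), frozenset("dc")]


@dataclass(frozen=True)
class Event:
    time: int
    label: str
    initial: tuple   # processes consumed by the event
    terminal: tuple  # processes produced by the event


def replay(start, events):
    """Recover the successive states by playing events forward in time."""
    present = set(start)
    states = [frozenset(present)]
    for e in sorted(events, key=lambda e: e.time):
        present -= set(e.initial)
        present |= set(e.terminal)
        states.append(frozenset(present))
    return states


# History from the example: a and b merge to form d.
merge_history = [Event(1, "merge", ("a", "b"), ("d",))]
# Hypothetical alternative: all three interact, and c and d come out.
other_history = [Event(1, "three-way", ("a", "b", "c"), ("c", "d"))]

print(replay("abc", merge_history) == state_sequence)
print(replay("abc", other_history) == state_sequence)
print(merge_history == other_history)
```

Both histories replay to the same two snapshots, so the states can be recovered from the events but not the events from the states, which is the sense in which the event record carries the formative history.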
So by the way, this is what we will call formative history,
and I'm going to be using this term regularly.
Hopefully we will get to the meaning of that.
So biology gives a very good example of science to suggest
how to think about objects, I think, for our purposes, for
purposes of classes and classification.
So what is a struct in this model?
Well, here is a struct: a sequence of events.
Some of the events, of course, you didn't see they were not
So that's OK.
They are not temporally ordered.
This is still a representation of the process, or a small
segment of the process.
By the way, this is nothing to do with graphs, now, because
of the temporal thing.
It was inspired, of course, by natural numbers because,
according to [UNINTELLIGIBLE]
axioms, this is how the numbers are built.
You have a successor sort of operation, and numbers are
built that way.
Now with numbers you can collapse them.
You don't need this initially primitive [? tribes ?]
stored actually this sort of [? not. ?]
But then they started to write three, so the temporal part
sort of disappeared.
But if you look at this, you can see now you cannot
collapse it because each one has a structure.
Each event now has a structure,
and you keep it as part of the representation.
So this is the fundamental distinction because you cannot
collapse it because each event is possibly structurally
different from the other and you have a temporal sequence.
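A small Python sketch of why numbers collapse and structs do not. The encoding is an assumption made for illustration: a number is written as a list of identical `"succ"` steps, and a struct as a list of hypothetical `(name, inputs, outputs)` events.

```python
def can_collapse(events):
    """A history can be replaced by a count only if all its events are identical."""
    return len(set(events)) <= 1


# A number as a chain of identical successor steps: only the count matters.
three = ["succ", "succ", "succ"]
collapsed_three = len(three) if can_collapse(three) else None

# A struct: each event carries its own structure (name, inputs, outputs),
# so the temporal sequence has to be kept as it is.
struct = [
    ("merge", ("a", "b"), ("d",)),
    ("split", ("d",), ("e", "f")),
    ("merge", ("e", "c"), ("g",)),
]
collapsed_struct = len(struct) if can_collapse(struct) else None

print(collapsed_three, collapsed_struct)
```

The chain of successor steps reduces to the numeral 3 with nothing lost, while the struct has no such shorthand because its events differ from one another.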
But it was inspired by numbers, of course, this
Well, here is a hydrogen atom.
You have one proton and one electron.
And you can see the sequence of events that correspond.
You can think of it as a hydrogen process.
These interactions are going on.
So that's what a representation of the hydrogen process would be.
By the way, this is a structural representation.
In contrast to Feynman diagrams,
which were just drawings,
this is interpreted as a formal structural
representation because these events are formal entities.
This is the lithium process, slightly more complex.
And you can see when
[? periodicity ?], because, of course--
yes, I have to move on faster--
So struct relabeling.
There is a formal concept of struct relabeling.
So when you relabel the struct, it's just a mapping. Each
struct, each class has a set of labels associated with it.
So you can take all the labels of one class and map them.
So you are allowed relabelings that preserve class membership.
But what will happen is it will reveal-- see, here are
the two structs.
One is obtained simply by relabeling the other one.
But now you can see that they share something which you
couldn't see before, you see here.
So relabeling plays an important role even though
it's a simple operation.
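Relabeling as a one-to-one renaming can be sketched in Python as follows. This is a hedged illustration: the `(name, inputs, outputs)` encoding of events, the `relabel` function, and all labels are hypothetical, and the per-class label sets mentioned in the talk are not modeled here.

```python
def relabel(struct, mapping):
    """Rename primitive labels in a struct; the renaming must be one-to-one."""
    labels = {x for _, ins, outs in struct for x in ins + outs}
    image = {mapping.get(x, x) for x in labels}
    if len(image) != len(labels):
        raise ValueError("relabeling must not merge two distinct labels")

    def rename(group):
        return tuple(mapping.get(x, x) for x in group)

    return [(name, rename(ins), rename(outs)) for name, ins, outs in struct]


first = [("merge", ("a", "b"), ("d",))]
second = [("merge", ("x", "y"), ("z",)), ("split", ("z",), ("u", "v"))]

# Before relabeling, the two structs appear to have nothing in common.
shared_before = [event for event in second if event in first]

# After relabeling, the part they share becomes visible.
relabeled = relabel(second, {"x": "a", "y": "b", "z": "d"})
shared = [event for event in relabeled if event in first]
print(shared_before, shared)
```

The operation only renames labels, yet it is what makes the common merge event comparable across the two structs.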
AUDIENCE: So what are you classifying as an entire
LEV GOLDFARB: That's right.
That's an object now.
Everything becomes an object.
That's a representation of an object.
AUDIENCE: How do you separate out
objects because one could--
LEV GOLDFARB: No.
This would be a representation of a single object.
It's a process.
AUDIENCE: But one could imagine one giant object, the
universe, or something, as one humongous process
LEV GOLDFARB: No, no.
There's nothing to classify.
If you take the whole universe,
there's nothing to classify.
You're talking now about an object, a conventional web page,
whatever you want to deal with in terms of classes; that would be
represented like this.
AUDIENCE: But how do you decide what's part of one
object and not?
LEV GOLDFARB: No, no.
You are modeling that object as this process.
So you choose your primitives, you choose them, and then you
encode it in this format.
AUDIENCE: So the same real life objects could be modeled
LEV GOLDFARB: They could be modeled differently but the
whole point here is, as the example from physics suggests,
that you want to choose events that would be most appropriate
for modeling a chunk of the domain you
want to deal with.
It is true.
Initially it will not be easy to do.
So you would choose certain events but eventually you
If you want to model a certain class of web pages, a
reasonably wide set, you will fix the set of events which you
can find reliably, and you will use that set of events.
These are interaction events in terms of the user.
AUDIENCE: For example, you have chemistry.
In chemistry you deal with molecules.
And molecules consist of atoms. How do you decide in a
particular situation whether we want to view this as a
collection of molecules or a collection of atoms?
LEV GOLDFARB: You will see that this formalism allows
stages of representation.
So you will see, in fact in this talk, how you--
when lithium and hydrogen interact and
form lithium hydride, you will see how it
can become a primitive.
Because this is natural, no special tricks are necessary.
It's just a part of the
representational formalism now.
So, structurally identical structs: you notice that the
labels are different here but their structure is the same.
So you can think of it this way: when you strip them of labels,
you get the real object, an abstract struct, you can call it.
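One way to picture stripping the labels, as a Python sketch. The approach is an assumption made for illustration: each label is replaced by the order in which it first appears, and events are assumed to be listed in the same temporal order in both structs. The `(name, inputs, outputs)` encoding and the function name `abstract_form` are hypothetical.

```python
def abstract_form(struct):
    """Replace each label by the index of its first appearance in the struct."""
    first_seen = {}

    def index_of(label):
        if label not in first_seen:
            first_seen[label] = len(first_seen)
        return first_seen[label]

    return [
        (name, tuple(index_of(x) for x in ins), tuple(index_of(x) for x in outs))
        for name, ins, outs in struct
    ]


s1 = [("merge", ("a", "b"), ("d",)), ("split", ("d",), ("e", "f"))]
s2 = [("merge", ("p", "q"), ("r",)), ("split", ("r",), ("s", "t"))]
s3 = [("merge", ("p", "q"), ("r",)), ("split", ("q",), ("s", "t"))]

same_structure = abstract_form(s1) == abstract_form(s2)
different_structure = abstract_form(s1) != abstract_form(s3)
print(same_structure, different_structure)
```

Here `s1` and `s2` differ only in labels, so their label-free forms coincide, while `s3` connects its events differently and does not match.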
Well, there is struct assembly. Here they are
depicted separately, but you will notice that they
share some of the primitives.
So when you put them together this is
called struct assembly.
What it means is that you observe them separately but,
of course, some primitive, for example this primitive, is
shared, of course, by this struct and by that struct.
So when you just picture them together that's
called struct assembly.
So in other words, when I am observing a face or a web page
I may have noticed only part of it, and I
represent it as a part.
But later on I look at another part of the web page
and then I will make a picture of it.
But then eventually I have to put things together because
I'm looking at different parts.
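Struct assembly can be sketched in Python as putting partial observations together while keeping each shared event only once. The encoding is hypothetical: each observed event is a `(time_step, name)` pair, and the web page event names are invented for the example.

```python
def assemble(*partial_structs):
    """Put separately observed parts together, keeping each shared event once."""
    occurrences = set()
    for part in partial_structs:
        occurrences.update(part)
    return sorted(occurrences)  # tuples sort by time step first


# Two partial observations of the same page; they share the event at step 2.
top_of_page = [(1, "load"), (2, "open_menu")]
rest_of_page = [(2, "open_menu"), (3, "scroll"), (4, "follow_link")]

assembled = assemble(top_of_page, rest_of_page)
print(assembled)
```

The shared event is what ties the two parts together; without it the parts would simply sit side by side.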
AUDIENCE: Lev, I think someone will come to get the room in a
few minutes, so you probably want to move
on to the main points.
LEV GOLDFARB: All right.
So here is an example of how to model Bubble Sort.
Here you can model both the architecture and the data flow.
The architecture is static for the array, and
you have a data flow.
So there are primitives--
this is just a simple example because it's
an array, very simple.
But it could illustrate some points.
So here is the model; here is how one comparison would be
modeled, a single comparison.
Now we will see a single inner loop.
So what you do is you bubble the 4,
bubble it up to its location.
This is how the representation looks.
The architecture doesn't change.
This just says this marks the boundary of [? cell ?] of
But you can have a dynamic architecture; it would be a more
complex picture.
Just a bit slow--
so here is the sorting of 4, 2, 3, 1.
So you see here is the struct
corresponding to the actual process.
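A Python sketch of the Bubble Sort example, recording every comparison as an event so that the run itself becomes the representation. The event tuple layout `("compare", i, j, left, right, swapped)` is an assumption made for illustration, not the primitives shown on the slide.

```python
def bubble_sort_struct(values):
    """Sort a copy of `values` and return it with the ordered list of events."""
    cells = list(values)  # static architecture: a fixed row of cells
    events = []           # data flow: one event per comparison, in time order
    for end in range(len(cells) - 1, 0, -1):
        for i in range(end):  # one inner loop bubbles the largest value up
            left, right = cells[i], cells[i + 1]
            swapped = left > right
            if swapped:
                cells[i], cells[i + 1] = right, left
            events.append(("compare", i, i + 1, left, right, swapped))
    return cells, events


sorted_cells, events = bubble_sort_struct([4, 2, 3, 1])
first_inner_loop = events[:3]
print(sorted_cells)
for event in events:
    print(event)
```

In the first inner loop the 4 is the left value of all three comparisons and is swapped each time, which is the "bubble the 4 up to its location" step; the whole run takes six comparison events.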
So 3:00 someone may come in?
I believe there's the service being used
AUDIENCE: I also had another talk at 3:00 I have to go to.
We've got about four more minutes before I have to start
LEV GOLDFARB: OK.
So here is an example of two structs.
Now here is the class description.
For a class description, you need constraints.
A class is defined with constraints, and here is how
you generate elements of the class.
You allow the environment to make a move, so to speak;
the environment is whatever processes participate.
The white lines represent the events that you
[UNINTELLIGIBLE] that becomes a class element.
So here is the picture.
Here is your class element.
So a class representation consists of a struct plus this
generating system that actually builds class elements,
and it is specified via those constraints.
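The idea of a class as a struct plus a constraint-driven generating system can be sketched in Python. Everything here is a simplifying assumption: the struct is a plain list of event names, the environment's move is a random choice, and `no_immediate_repeat` is a toy constraint standing in for the real class constraints.

```python
import random


def generate_element(seed_struct, candidate_events, constraint, steps, rng):
    """Grow a class element: each step appends one event the constraint allows."""
    struct = list(seed_struct)
    for _ in range(steps):
        allowed = [e for e in candidate_events if constraint(struct, e)]
        if not allowed:
            break
        struct.append(rng.choice(allowed))  # the environment makes a move
    return struct


def no_immediate_repeat(struct, event):
    """Toy constraint: the same event may not occur twice in a row."""
    return not struct or struct[-1] != event


element = generate_element(
    seed_struct=["compare"],
    candidate_events=["compare", "swap", "advance"],
    constraint=no_immediate_repeat,
    steps=5,
    rng=random.Random(0),
)
print(element)
```

Different random seeds give different elements, but every generated struct satisfies the same constraint, which is what makes them elements of one class in this sketch.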
So here is the same example with web
pages, you know, just--
Here is the simulation.
These are the constraints for Bubble Sort, and you will now see a
simulation of what happens, how you generate, because this
is a class description.
So this is sort of description.
This is how that struct is being generated.
And, of course, this class description could be used for
construction of any struct that belongs to so-called
Now, it turns out that there are levels.
You see, if you take a struct, and if you learn some classes,
the whole thing could be partitioned then, and you have
more information.
So you could call it a level one struct because it contains
this partitioning based on the classes that you have learned.
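Partitioning a struct by learned classes can be pictured with a short Python sketch. The assumptions are loud ones: a learned class is reduced to a fixed, non-empty pattern of event names, matching is greedy from left to right, and the class name `H` and event names are hypothetical.

```python
def partition(struct, learned_classes):
    """Split a struct into chunks, naming each chunk that matches a learned class."""
    chunks = []
    i = 0
    while i < len(struct):
        for class_name, pattern in learned_classes.items():
            if tuple(struct[i:i + len(pattern)]) == pattern:
                chunks.append((class_name, pattern))
                i += len(pattern)
                break
        else:
            # No learned class starts here: keep the single event unlabeled.
            chunks.append((None, (struct[i],)))
            i += 1
    return chunks


learned_classes = {"H": ("p", "e")}
level_0_struct = ["p", "e", "x", "p", "e"]
level_1_struct = partition(level_0_struct, learned_classes)
print(level_1_struct)
```

The partitioned struct carries the same events plus the information about which chunks are elements of known classes, and those chunks are what the next level builds with.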
So there is a formal description of how to build next-level
classes because there are constraints, now,
associated with how class elements interact.
So still the same kind of language.
And here is a class element construction.
Now you use larger chunks because now you are not using
single primitives; you're using class elements to build this
level 1 class.
The same sort of a picture, but you're building out with
So this is just an example of how you can build.
So here, you see this is a level 1 class but
it's built out of--
here is a description of classes of previous level.
This is one element of a level 1 class.
And just very similar here to what you have here.
Here are transformations.
Here is a process, something happens here.
Now at this level, you can see the processes.
Before, the processes were not visible.
So this is how you go to the next level, it's
Now here is an example with this lithium hydride.
This is what you have seen, but you can shrink it and it
becomes primitive, and you can use it for
description of other.
So that's the last slide.
This is the encapsulation of the whole picture.
You have various level classes here, then you move to the
This is when these transformations appear, when
you see these interactions, larger chunks, and so on.
This is how this representation--
So the basic idea is this, that I don't think we can do
classification without an adequate
representational formalism.
In spite of the fact that there are so many people
engaged in classification I think that it has to do with
the right representational formalism that will have an
important impact, a major impact on this activity.