Transcript
EwueqdgIvq4 • Jeffrey Shainline: Neuromorphic Computing and Optoelectronic Intelligence | Lex Fridman Podcast #225
Kind: captions
Language: en
the following is a conversation with
jeff shainline a scientist at nist
interested in optoelectronic
intelligence
we have a deep technical dive into
computing hardware that will make jim
keller proud i urge you to hop on to
this rollercoaster ride through
neuromorphic computing and
superconducting electronics and hold on
for dear life
jeff is a great communicator of
technical information and so it was
truly a pleasure to talk to him about
some physics and engineering
to support this podcast please check out
our sponsors in the description
this is the lex fridman podcast and
here is my conversation with jeff
shainline i got a chance to read a
fascinating paper you um authored called
optoelectronic intelligence
so maybe we could start by talking about
this paper and start with the basic
questions what is optoelectronic
intelligence
yeah so in that paper the the concept i
was trying to describe is
sort of an architecture for building
brain-inspired computing
that leverages light for communication
in conjunction with
electronic circuits for computation
in that particular paper a lot of the
work we're doing right now in our
project at nist is focused on
superconducting electronics for
computation i'll go into
why that is but
that might make a little more sense in
context if we first
describe what that is in contrast to
which is semiconducting electronics
so is it worth taking a couple minutes
to describe
semiconducting electronics
it might even be worthwhile to step back
and uh talk about electricity and
circuits and how circuits work right
before we talk about superconductivity
right okay
how does the computer work jeff well i
won't go into everything that makes a
computer work but
let's talk about the
basic building blocks a transistor so
and even more basic than that a
semiconductor material silicon say so
uh in silicon silicon is a semiconductor
and what that means is at low
temperature there are no free charges no
free electrons that can move around so
when you talk about electricity you're
talking about
predominantly electrons moving to
establish electrical currents and they
move under the influence of voltages so
you apply voltages
electrons move around those can be
measured as currents and you can
represent information in that way so
semiconductors are special
in the sense that
they are really malleable so if you have
a semiconductor material
you can change the number of free
electrons that can move around by
putting different
elements different atoms in lattice
sites so
what is a lattice site well a
semiconductor is a crystal which means
all the atoms that comprise the material
are
at exact locations that are perfectly
periodic in space so if you start at any
one atom and you go along the what are
called the lattice vectors you get to
another atom and another atom and
another atom and for high quality
devices it's important that it's a a
perfect crystal with very few defects
but you can intentionally replace a
silicon atom with say a phosphorus atom
and then you can you can change the
number of free electrons that are in a
region of space that has that excess of
what are called dopants so picture a
device that has a left terminal and a
right terminal
and if you apply a voltage between those
two you can cause electrical current to
flow between them
now we
add a third terminal up on top there and
depending on the voltage between the
left and right terminal and that third
voltage you can you can change that
current so what's commonly done in
digital electronic circuits is to
leave a fixed voltage from left to right
and then
change that voltage that's applied at
what's called the gate the gate of the
transistor so
what you do is you you make it to where
there's an excess of electrons on the
left excess of electrons on the right
and very few electrons in the middle and
you do this by changing the
concentration of different dopants in
the lattice spatially
and then when you apply a voltage to
that gate you can either cause current
to flow or turn it off and so that's
sort of your zero and one you if you
apply voltage current can flow that
current is representing a digital one
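The on/off behavior described here can be sketched as a voltage-controlled switch. The following is a minimal, idealized model, not a device simulation: the threshold voltage, the supply voltage, and all function names are assumptions made for illustration. It shows how two such switches in series give a NAND gate, and how other logic can be composed from NAND alone.

```python
# Idealized sketch: a transistor as a voltage-controlled switch.
# V_THRESHOLD and VDD are hypothetical values chosen only for illustration.
V_THRESHOLD = 0.5  # volts (assumed)
VDD = 1.0          # volts (assumed supply voltage representing a digital one)


def transistor_conducts(gate_voltage: float) -> bool:
    """Current can flow only when the gate voltage exceeds the threshold."""
    return gate_voltage > V_THRESHOLD


def nand(a: int, b: int) -> int:
    """Two switches in series pull the output low only when both conduct."""
    pull_down = transistor_conducts(a * VDD) and transistor_conducts(b * VDD)
    return 0 if pull_down else 1


def not_(a: int) -> int:
    return nand(a, a)


def and_(a: int, b: int) -> int:
    return not_(nand(a, b))


def xor(a: int, b: int) -> int:
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))


def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum bit, carry bit) built purely from NAND gates."""
    return xor(a, b), and_(a, b)


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "->", half_adder(a, b))
```

Real gates use complementary transistor pairs and finite voltages rather than perfect switches; the point is only that a single on/off element is enough to compose arithmetic.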
and uh from that from that basic element
you can build up
all the complexity of digital electronic
circuits that
have
really had a profound influence on our
society now you're talking about
electrons can you give a sense of what
scale we're talking about when we're
talking about in silicon
uh being able to mass manufacture these
kinds of uh gates
yeah so scale in a number of different
senses well at the scale of the silicon
lattice the distance between two atoms
there is half a nanometer so
um
people often like to compare these
things to the the width of a human hair
i think it's some
six orders of magnitude smaller than the
width of a human hair
uh something on that order
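As a rough check on that comparison, the number of orders of magnitude can be computed directly. The lattice spacing of half a nanometer is taken from the conversation; the hair width of 75 micrometers is an assumed typical value, and real hair varies quite a bit.

```python
import math

LATTICE_SPACING_M = 0.5e-9  # half a nanometer, as quoted in the conversation
HAIR_WIDTH_M = 75e-6        # assumed typical human hair width


def orders_of_magnitude(larger: float, smaller: float) -> float:
    """Base-10 logarithm of the ratio between two lengths."""
    return math.log10(larger / smaller)


hair_vs_lattice = orders_of_magnitude(HAIR_WIDTH_M, LATTICE_SPACING_M)

if __name__ == "__main__":
    print(f"ratio: {HAIR_WIDTH_M / LATTICE_SPACING_M:.0f}")
    print(f"orders of magnitude: {hair_vs_lattice:.1f}")
```

Under these assumptions the ratio comes out between five and six orders of magnitude, in the same ballpark as the estimate given.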
so remarkably small we're talking about
individual atoms here and electrons are
of that length scale when they're in
that environment
but there's another sense that scale
matters in digital electronics this is
perhaps the more important sense
although they're related scale refers to
a number of things it refers to
the size of that transistor so for
example i said you have a left contact a
right contact and some space between
them where the
the gate electrode sits that that's
called the the channel width uh or the
channel length and um
what has enabled what we think of as
moore's law or the continued
increased performance in silicon
microelectronic circuits is the ability
to make that size that feature size ever
smaller ever smaller at a
a
really remarkable pace i mean
that that feature size has
decreased uh consistently every couple
of years
since the 1960s and
that was what moore predicted in the
1960s he thought it would continue for
at least two more decades and it's been
much longer than that and so
um
that is why we've been able to fit ever
more devices ever more transistors ever
more computational power on essentially
the same size of chip
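The compounding behind that statement can be sketched numerically. This assumes the commonly quoted two-year doubling period for transistor density, which is an assumption here since the conversation only says "every couple of years", and it assumes density scales as one over the feature size squared.

```python
def density_growth(years: float, doubling_period_years: float = 2.0) -> float:
    """Factor by which transistor density grows under steady doubling."""
    return 2.0 ** (years / doubling_period_years)


def linear_shrink(years: float, doubling_period_years: float = 2.0) -> float:
    """Factor by which the linear feature size shrinks.

    Assumes density is proportional to 1 / feature_size**2.
    """
    return density_growth(years, doubling_period_years) ** -0.5


if __name__ == "__main__":
    for years in (2, 10, 20, 50):
        print(
            f"{years:>2} years: density x{density_growth(years):,.0f}, "
            f"feature size x{linear_shrink(years):.4f}"
        )
```

Each doubling of density corresponds to the feature size shrinking by a factor of about 0.7, which is why a few decades of steady scaling adds up to many orders of magnitude.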
so a user sits back and does essentially
nothing you're running the same computer
program but those devices are getting
smaller so they get faster they get more
energy efficient and all of our
computing performance just continues to
improve and we don't have to
think too hard about
what we're doing as say a
software designer or something like that i
absolutely don't mean to say that
there's no innovation in
software or the user side of
things of course there is but
from from the hardware perspective we
just have been given this gift of
continued performance improvement
through this scaling that is ever
smaller feature sizes with very similar
um say power consumption that power
consumption
has not continued to scale in the most
recent decades but um
nevertheless we had a really good run
there for a while and now we're down to
gates that are seven nanometers which is
state of the art right now maybe
global foundries is trying to push it
even lower than that i can't keep up
with
where the predictions are that it's
going to end but seven nanometer
uh
seven nanometer transistor
has just just a few tens of atoms along
the length of the conduction pathway so
a naive
semiconductor device physicist would
think you can't go much further than
that without
some kind of revolution in the way we
think about the physics of our devices
is there something to be said about
the mass manufacture of these devices
right right so that's another thing so
how have we been able to make those
transistors smaller and smaller well
companies like intel global foundries
they invest a lot of money in the
lithography so how are these
chips actually made well one of the most
important steps is this what's called
ion implantation so you have you start
with sort of a pristine silicon crystal
and then using photolithography which is
a technique where you can pattern
different shapes using light
you can define which regions of space
you're going to implant with different
uh different species of ions that are
going to change the local electrical
properties right there
so by using ever shorter wavelengths of
light and different kinds of optical
techniques and different kinds of
lithographic techniques things that go
far beyond
my knowledge base
you can just simply shrink that feature
size down and you say you're at seven
nanometers well the wavelength of light
that's being used is over 100 nanometers
that's already deep in the uv so
how are those minute features
patterned well there's there's an
extraordinary amount of innovation that
has gone into that but nevertheless it
stayed very consistent in this
ever-shrinking feature size and now the
question is can you make it smaller and
even if you do do you still continue to
get performance improvements but that's
another kind of scaling where
these these companies have been able to
so okay you you picture a chip that has
a processor on it well that chip is not
made as a chip it's made on a wafer
and um
using photolithography you basically
print the same pattern
on different dies all across the wafer
multiple layers tens probably
probably a hundred some layers in a
mature foundry process and and you do
this on ever bigger wafers too that's
another aspect of scaling that's
occurred in the last several decades so
now you have this 300 millimeter wafer
it's like as big as a pizza and it has
maybe a thousand processors on it and
then you dice that up using a saw and
now you can sell these things
so cheap because the the manufacturing
process was so streamlined i think a
technology as revolutionary as silicon
microelectronics
has to have that kind of
manufacturing scalability which i will
just emphasize i believe is
enabled by
physics it's not i mean that of course
there's human ingenuity that goes into
it but at least from my
my side where where i sit it sure looks
like the physics of our universe allows
us to to produce that and we've we've
discovered how
more so than we've invented it although
of course we have invented it humans
have invented it but it was it's almost
as if it was there waiting for us to to
discover it you mean the entirety of it
or are you specifically talking about
the techniques of photolithography like
the optics involved i mean the entirety
of the scaling down to the seven
nanometers that you're able to have
electrons
not interfere with each other in such a
way that
you could still have gates like that's
enabled to achieve that scale spatial
and temporal
seems to be very special and is enabled
by the physics of our world all the
things you just said so starting with
the the silicon material itself
silicon is a unique semiconductor it has
essentially ideal properties for making
a specific kind of transistor that's
extraordinarily useful so
i mentioned that
silicon has uh well when you make a
transistor you have this gate contact
that sits on top of the conduction
channel and depending on the voltage you
apply there you pull more carriers into
the conduction channel or push them away
so it becomes more or less conductive
in order to have that work without just
sucking those carriers right into that
contact you need a very thin insulator
and and part of scaling has been to
gradually decrease the thickness of that
of that gate insulator so that you can
use a roughly similar voltage and still
have the same
current voltage characteristics so the
material that's used to do that or i
should say was initially used to do that
was a silicon dioxide which just
naturally grows on the silicon surface
so you expose silicon to
the atmosphere that we breathe and
uh well if you're manufacturing you're
going to
purify these gases but nevertheless that
that what's called a native oxide will
grow there there are essentially no
other materials on the entire periodic
table that have as good of a
gate insulator as as that silicon
dioxide and that that has to do with
nothing but the physics of the
interaction between silicon and oxygen
and if it wasn't that way transistors
could not
they they could not perform in
nearly the the degree of capability that
they have and that that has to do with
the way that the the oxide grows the
reduced density of defects there
its insulation meaning essentially its
energy gap you can apply a very large
voltage there without having current
leak through it so that's physics right
there
um
there are other things too silicon is a
semiconductor in in an elemental sense
you you only need silicon atoms a lot of
other semiconductors you need two
different kinds of atoms like a compound
from group three and a compound from
group five that opens you up to lots of
defects that can occur where one atom's
not sitting quite at the lattice site it
is and it's switched with another one
that degrades performance
but then also on the side that you
mentioned with the the manufacturing
we have access to light sources that can
produce these very short wavelengths of
light
how does photolithography occur well you
actually put this polymer on top of your
wafer
and you expose it to light and then you
use an
aqueous chemical processing to dissolve
away the regions that were exposed to
light and leave the regions that were
not
and
we are blessed with these polymers that
have the right property where
they can um
cause scission events where the polymer
splits where a photon hits i mean you
know maybe maybe that's not too
surprising but i don't know it all it
all comes together to have this
really complex uh manufacturable
ecosystem where
very sophisticated technologies can be
devised
and it works quite well
and amazingly like you said with a
wavelength at like 100 nanometers or
something like that you're still able to
achieve on this polymer precision of
whatever whatever we said seven
nanometers yeah i think i've heard like
four nanometers being talked about
something like that yes
i if we could just pause on this and
we'll return to superconductivity but
in this whole journey from a history
perspective what what do you think is
the most beautiful
at the intersection of engineering and
physics to you and this whole process
that we talked about with silicon and
photolithography
things that people were able to achieve
in order to uh
push
the moore's law forward is it the early
days the the invention of the transistor
itself is it uh some particular cool
little thing that um
maybe not many people know about like
what do you think is most beautiful in
this
in this whole process journey
the most beautiful is a little difficult
to answer let me let me try and sidestep
it a little bit and just say
what strikes me about looking at the the
history of
silicon microelectronics is that
uh so when when quantum mechanics was
developed people quickly began applying
it to semiconductors and it was
broadly understood that these are
fascinating systems and people cared
about them for their basic physics but
also
their utility as devices and then the
transistor was invented in the late 40s
in a
relatively crude experimental setup
where you just crammed a metal electrode
into the semiconductor and and that was
that was ingenious these people were
able to um
make it work you know uh
but so
what what i want to get to that that
really strikes me is that in those early
days
there were a number of different
semiconductors that were being
considered they had different properties
different strengths different weaknesses
most people thought germanium was the
the way to go
it it had some some nice properties uh
related to things
about how the electrons move inside the
lattice
but other people thought that compound
semiconductors with group 3 and group 5
also had really really extraordinary um
properties that might be conducive to to
making the best devices so there were
different groups exploring each of these
and that's great that's how science
works you have to cast a broad net but
then
what i what i find striking is why why
is it that silicon
won because it's not that it's not that
germanium is a useless material and it's
not present in technology or compound
semiconductors they're both
doing
doing exciting and important things
slightly more niche applications whereas
silicon is the semiconductor material
for microelectronics which is the
platform for digital computing which has
transformed our world why did silicon
win it's because of a remarkable
assemblage of qualities
that no one of them was the clear winner
but it it made these sort of compromises
between a number of different influences
it had that really excellent
gate oxide that allowed it to that
allowed us to make mosfets these high
performance transistors
so quickly and cheaply and easily
without having to do a lot of materials
development the the band gap of silicon
um
is actually so in a semiconductor
there's there's an important parameter
which is called the band gap which tells
you uh if you they're they're sort of
electrons that fill up to one level in
in the energy
diagram and then there's a gap where
electrons aren't allowed to have an
energy in a certain range and then
there's another energy level above that
and that that difference between the
lower sort of filled level and the
unoccupied level that tells you how much
voltage you have to apply in order to
induce a current to flow
so with germanium that's about 0.75
electron volts that means you have to
apply 0.75 volts to to get a current
moving
and it turns out that
if you compare that to the
the thermal excitations that are induced
just by the temperature of our
environment that gap's not quite big
enough you start to
use it to perform computations it gets a
little hot and you get all these
accidental carriers that are excited
into the the conduction band and it
causes errors in your computation
silicon's band gap is just a little
higher 1.1
electron volts but you have an
exponential dependence on the
the number of carriers that are present
that can induce those errors
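The exponential dependence can be made concrete with a Boltzmann factor. The sketch below uses the standard scaling of thermally excited carrier density in an intrinsic semiconductor, exp(-E_gap / (2 k T)), with the band gap values quoted in the conversation. It ignores material-dependent prefactors, so it is an order-of-magnitude illustration only; the function name and the 300 K temperature are assumptions.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K

GE_GAP_EV = 0.75  # germanium band gap as quoted in the conversation
SI_GAP_EV = 1.1   # silicon band gap as quoted in the conversation


def thermal_carrier_factor(band_gap_ev: float, temperature_k: float = 300.0) -> float:
    """Exponential factor governing thermally excited carriers.

    Prefactors that differ between materials are deliberately omitted.
    """
    return math.exp(-band_gap_ev / (2.0 * K_B_EV_PER_K * temperature_k))


# How many times more thermally excited carriers the smaller gap allows.
ratio = thermal_carrier_factor(GE_GAP_EV) / thermal_carrier_factor(SI_GAP_EV)

if __name__ == "__main__":
    print(f"germanium / silicon carrier factor at 300 K: {ratio:.0f}")
```

With these numbers, a gap that is only about 0.35 eV larger suppresses thermally excited carriers by roughly three orders of magnitude at room temperature.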
uh it decays exponentially with that
voltage so just that that slight extra
energy in that band gap
really puts it in an ideal position to
be operated
in the in the conditions of our of our
ambient environment it's kind of
fascinating that so like you mentioned
errors um decrease exponentially
uh with the voltage so
it's funny because this error thing
comes up you know when you start talking
about quantum computing
it's kind of amazing that everything
we've been talking about the errors as
we scale down seems to be extremely low
yes and like all of our computation
is based on the assumption that it's
extremely low yes so it's not digital
computation digital sorry digital
computation so as opposed to our
biological computation our brain is like
the assumption is stuff is gonna fail
all over the place and we somehow have
to still be robust to that that's
exactly right so this also this is gonna
be the most controversial part of our
conversation where you're gonna make
some enemies so let me ask because we've
been talking about physics and
engineering
a which group of people is smarter and
more important for this one
let me ask the question in a better way
some of the big innovations some of the
beautiful things that we've been talking
about how much of it is physics how much
of it is engineering my dad is a
physicist and he talks down to all the
amazing engineering that we're doing
in
the artificial intelligence and the
computer science and the robotics and
all that space so we argue about this
all the time so what do you think who
gets more credit i'm genuinely not
trying to just be politically correct
here i don't see how you would have
any of the
what we consider sort of the great
accomplishments of society without both
and you absolutely need both of those
things physics tends to play a key role
earlier in the development and then
engineering optimization these things
take over
and uh i mean
the invention of the transistor or
actually even before that the
understanding of semiconductor physics
that allowed the invention of the
transistor that's all physics so if you
didn't have that physics you don't even
get to get on the on the on the field
but
once you have
understood and demonstrated that this is
in principle possible
moore's law is engineering that
why we have uh
computers more powerful than
than old supercomputers in each of our
phones is that's all engineering and
i i think i would
be quite foolish to say that
that's
i mean that that's not valuable that it's
not a great contribution uh it's a
beautiful dance would you put like
silicon
the understanding of the material
properties
in the space of engineering like how
does that whole process work to
understand that it has all these nice
properties or even the development of
photolithography
is is that basically
would you put that in the category of
engineering
no i would say that
it is basic physics it is applied
physics it's material science it's um
x-ray crystallography it's polymer
chemistry it's it's everything i mean
chemistry even is thrown in there
absolutely yes yes absolutely just no
biology
okay we can get to biology right well
the biology is in the humans that are
engineering the system that's all
integrated deeply okay so let's return
you mentioned this uh word
superconductivity
so what does that have to do
with what we're talking about right okay
so in a semiconductor as i
tried to describe a second ago
you can sort of uh
induce currents by applying voltages
and those have sort of typical
properties that you would expect from
some kind of a conductor those electrons
they don't just flow
perfectly without dissipation if an
electron collides with an imperfection
in the lattice or another electron it's
going to slow down it's going to lose
its momentum so you have to keep
applying that voltage in order to keep
the current flowing in a superconductor
something different happens if you get a
current to start flowing it will
continue to flow indefinitely there's
there's no dissipation so
that's crazy how does that happen well
it happens at low temperature and this
is crucial it has to it has to be a
quite low temperature and
what what i'm talking about there i
for essentially all of our conversation
i'm going to be talking about
conventional superconductors um
sometimes called low tc superconductors
low critical temperature superconductors
and so
those materials have to be
at a temperature around say around 4
kelvin i mean their critical temperature
might be 10 kelvin something like that
but you want to operate them at around 4
kelvin 4 degrees above absolute zero and
what happens at
that temperature at that very low
temperature in certain materials is
that the the noise of
atoms moving around the lattice
vibrating electrons colliding with each
other that becomes sufficiently low that
the electrons can settle into this very
special state it's sometimes referred to
as a macroscopic quantum state because
if i had a piece of
superconducting material here let's say
niobium is a very typical
um
superconductor if i if i had a block of
niobium here and we cooled it below its
critical temperature
all of the electrons in that in that
superconducting state would be in one
coherent quantum state they would
the the wave function of that state
is described in terms of all of the
particles simultaneously but it extends
across macroscopic dimensions the size
of a whatever material the size of
whatever
block of that material i have sitting
here and the way that the way this
occurs is that
you know we let's try to be a little bit
light on the technical details but
essentially the electrons coordinate
with each other they they are able to
in this macroscopic quantum state
they're able to sort of
one can quickly take the place of the
other you can't tell electrons apart
they're they're what's known as
identical particles so if this electron
runs into a
defect that would otherwise cause it to
scatter
it can just sort of
um
almost miraculously avoid that defect
because it's not really in that location
it's part of a macroscopic quantum state
and the entire quantum state was not
scattered by that defect so you can get
a current that flows without dissipation
and that's called a supercurrent
that's
uh
sort of just very much scratching the
surface of of superconductivity there's
very deep and rich physics there which
is probably not
the main subject we need to go into
right now but it turns out that when you
have
this material you can you can do usual
things like make wires out of it so you
can get current to flow in a straight
line on a chip but you can also make
other
devices that perform
different kinds of operations some of
them are kind of logic operations like
you like you'd get in a transistor the
most common or most um
i would say
diverse in its utility the component is
a josephson junction it's not analogous
to a transistor in the sense that if you
apply a voltage here it changes how much
current flows from left to right but it
is analogous in sort of a sense of
it's the it's the go-to component that a
that a circuit engineer is going to use
to start to build up more complexity so
these are uh these junctions serve as
gates
they can they can serve as gates they
can
so i'm not sure how
um concerned to be with semantics but
let me just briefly say what a josephson
junction is and we can talk about
different ways that they can be used
basically if you have a superconducting
wire and then a small
gap of
a different material that's not
superconducting an insulator or normal
metal and then another superconducting
wire on the other side that's a josephson
junction so it's sometimes referred to
as a superconducting weak link so you
have this
superconducting state on one side and on
the other side and that the
superconducting wave function
actually tunnels across that gap and
when you when you create such a physical
entity it has very unusual
um
current voltage
characteristics within that gap
like like weird stuff through the entire
circuit so you can imagine suppose you
had a loop set up that had one of those
weak links in in the loop
current would flow in that loop
independent even if you hadn't applied a
voltage to it and that's called the
josephson effect so the fact that
there's this
phase difference in the quantum wave
function from one side of the tunneling
barrier to the other induces current to
flow so how do you change state right
exactly so how do you change state now
picture
if i have a
current bias coming down this line in my
circuit and there's a josephson
junction right in the middle of it
and now i make another wire that goes
around the josephson junction so i have
a loop here a superconducting loop
i can add
current to that loop by exceeding the
critical current of that josephson
junction so
like any superconducting material
it can carry this supercurrent that i've
described this current that can
propagate without dissipation up to a
certain level and if you try and pass
more current than that through the
material it's going to become a
resistive material a normal normal
material so in the josephson
junction the same thing happens i can
bias it above its critical current and
then what it's going to do it's going to
add a
quantized amount of current into that
loop and what i mean by quantized is
it's going to come in discrete packets
with a well-defined value of current
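The size of one such packet can be estimated from the magnetic flux quantum, h / (2e). Adding one flux quantum to a superconducting loop of inductance L changes the circulating current by the flux quantum divided by L. The 10 picohenry loop inductance below is a hypothetical value chosen only for illustration, as are the function and variable names.

```python
PLANCK_J_S = 6.62607015e-34            # Planck constant (exact SI value)
ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge (exact SI value)

# Magnetic flux quantum in webers; the 2e reflects paired electrons.
PHI0_WB = PLANCK_J_S / (2.0 * ELEMENTARY_CHARGE_C)


def current_step_a(loop_inductance_h: float) -> float:
    """Change in loop current when one flux quantum is added."""
    return PHI0_WB / loop_inductance_h


ASSUMED_LOOP_INDUCTANCE_H = 10e-12  # hypothetical 10 pH loop

if __name__ == "__main__":
    step = current_step_a(ASSUMED_LOOP_INDUCTANCE_H)
    print(f"flux quantum: {PHI0_WB:.4e} Wb")
    print(f"current step for a 10 pH loop: {step * 1e6:.1f} microamps")
```

The step size depends entirely on the loop inductance, so the figure printed here is only as meaningful as the assumed 10 pH.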
in the
vernacular of of some people working in
this community
you would say
you pop a fluxon into the loop so a
fluxon you pop a fluxon into the loop
yeah so if that's a skateboarder
sorry go ahead
a fluxon is one of these quantized
uh sort of uh amounts of current that
you can add to a loop and this is a
cartoon picture but i think it's
sufficient for our purposes so which uh
maybe it's useful to say
what is the speed at which these
discrete packets of current travel
because we'll be talking about light a
little bit it seems like the speed is
important the speed is important that's
an excellent question
sometimes i wonder where you
how you became so astute
but um so
this uh matrix four is coming out so
maybe that's related i'm not sure i'm
dressed for the job
i was trying to get to become an extra
in matrix four it didn't work out
anyway uh so what's the speed of these
packets you'll have to find another gig
i know i'm sorry um so the speed of the
packets actually these fluxons
these uh sort of pulses of
current that are generated by josephson
junctions they can actually propagate
very close to the speed of light uh
maybe something like a third of the
speed of light that's quite fast
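To get a feel for that speed, the time a pulse takes to cross a chip can be estimated. The one-third fraction is the rough figure from the conversation; the one centimeter distance is an assumed chip-scale length chosen for illustration, and the function name is hypothetical.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def propagation_delay_s(distance_m: float, fraction_of_c: float = 1.0 / 3.0) -> float:
    """Time for a pulse to travel distance_m at a given fraction of light speed."""
    return distance_m / (fraction_of_c * SPEED_OF_LIGHT_M_PER_S)


ASSUMED_DISTANCE_M = 0.01  # hypothetical 1 cm path across a chip
delay_s = propagation_delay_s(ASSUMED_DISTANCE_M)

if __name__ == "__main__":
    print(f"delay over 1 cm: {delay_s * 1e12:.0f} picoseconds")
```

Under these assumptions a pulse crosses one centimeter in about a hundred picoseconds.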
one of the reasons why josephson
junctions are appealing is because
their signals can propagate quite fast
and they can they can also switch very
fast what i mean by switch is perform
that operation that i described where
you add current to the loop
that can happen within um
a few tens of picoseconds so you can get
you can get devices that operate in the
hundreds of gigahertz range and by
comparison
most processors
in our in our conventional computers
operate closer to the the one gigahertz
range maybe three gigahertz seems to be
kind of
where where those speeds have have
leveled out so the gamers listening to
this are getting really excited that
overclock their system to like what is
it like four gigahertz or something 100
this sounds incredible uh can i just as
a tiny tangent is the
physics of this understood well how to
do this stably oh yes the physics is
understood well the physics of josephson
junctions is understood well the
technology's understood quite well too
the reasons why
it hasn't displaced silicon
microelectronics in conventional digital
computing
i think are more related to what i was
alluding to before about the the
myriad practical almost mundane aspects
of silicon that make it so useful
you can make a transistor ever smaller
and smaller
and it will still perform its digital
function quite well the same is not true
of a josephson junction you really they
don't they just it's not the same thing
that there's this feature that you can
keep making smaller and smaller and
it'll keep performing the same
operations this loop i described any
josephson circuit well i'm going to
be careful i shouldn't say any josephson
circuit but many josephson circuits
the way they process information or the
way they perform whatever function it is
they're trying to do maybe it's sensing
a weak magnetic field
it it depends on an interplay between
the junction and that loop and you can't
make that loop much smaller and it's not
for practical reasons that have to do
with lithography it's for fundamental
physical reasons about the way
the magnetic field interacts with that
superconducting material there's there
are physical limits that no matter how
good our technology got
those circuits would i think would never
be able to be scaled down to the the
densities that silicon microelectronics
can i don't know if we mentioned is
there something interesting about the
various superconducting materials
involved or is it all there's a lot of
stuff that's interesting and it's not
silicon it's not silicon no so like it's
some materials that also required to be
super cold four kelvin yes so let's
dissect a couple of those different
things the super cold part let me just
mention for your gamers out there that
are trying to clock it at four gigahertz
and would love to go to what kind of
cooling system can achieve exactly four
kelvin you need liquid helium and so
liquid helium is expensive it's
inconvenient you need a cryostat that
that sits there and
the energy consumption of that cryostat
is impracticable for it's not going in
your cell phone you're not so you can
picture holding your cell phone like
this and then something the size of you
know uh
a keg of beer or something on your back
to cool it like that makes no sense yeah
so if you if you're trying to make this
in consumer devices uh electronics that
are ubiquitous across society
superconductors are not in the race for
that for now but you're saying so we're
just to frame the conversation maybe the
thing we're focused on is computing
systems that serve as like as servers
like large yes large systems so so then
you can contrast what's going on in your
cell phone with what's going on at
one of the super computers
um
colleague katie schuman invited us out
to oak ridge a few years ago so we got
to see titan and that was when they were
building summit so these are some high
performance supercomputers
out in tennessee and those are filling
entire rooms the size of warehouses you
know so once you're at that level okay
there you're already putting a lot of
power into cooling you need
cooling is part of your engineering task
that you have to deal with so there it's
not entirely obvious that cooling to 4
kelvin is out of the question it's it
has not happened yet and i can speak to
why that is in the digital domain if
you're interested
i think it's not going to happen i don't
think
i don't think superconductors are going
to replace semiconductors
for digital computation
um there are there are a lot of reasons
for that but i think ultimately what it
comes down to is all things considered
cooling
errors
scaling down to feature sizes all that
stuff semiconductors work better at the
system level is there some aspect of uh
just
curious about the historical momentum of
this is there some power to the momentum
of an industry that's mass manufacturing
using a certain material is this is like
a titanic shifting like what's your
sense when a good idea comes along how
good does that idea need to be for the
titanic to start shifting
that's a that's an excellent question
that's an excellent way to to frame it
and you know
i don't know the answer to that but what
i think is okay so the the history of
the superconducting logic
goes back to the 70s ibm made a big push
to do superconducting digital computing
in the 70s and they made some choices
about their
devices and their architectures and
things that
in hindsight were kind of doomed to fail
and i don't mean any disrespect for the
people that did it it was hard to see at
the time but then another generation of
superconducting logic was introduced
i want to say the 90s
by likharev and semenov they
proposed an entire family of circuits
based on josephson junctions that are
doing digital computing based on logic
gates and or
not these kinds of things
um
and they showed how it could go hundreds
of times faster than silicon
microelectronics and it was it's
extremely exciting i wasn't working in
the field at that time but later when i
went back and read the literature i was
just like wow this is this is so awesome
uh and so it you might think well
the reason why it didn't displace
silicon is because silicon already had
so much momentum at that time
but that was the 90s silicon kept that
momentum because it had the simple way
to keep getting better you just make
features smaller and smaller so
you know it would have to be
i don't think it would have to be that
much better than silicon to displace it
but the problem is it's just not better
than silicon it might be better than
silicon in one metric speed of a
switching operation or power consumption
of a switching operation
but building a digital computer is a lot
more than just that elemental operation
it's
everything that goes into it including
the manufacturing including the
packaging including the
um the you know various materials
aspects of things so
the reason why and even in even in some
of those early papers i can't remember
which one it was likharev said something
along the lines of
you can see how we could build an entire
family of digital electronic circuits
based on these components they could go
100 or more times faster than
semiconductor
logic gates
but i don't think that's the right way
to use superconducting electronic
circuits he didn't say what the right
way was but he basically said
digital logic trying to
steal the show from silicon is probably
not what these circuits are most
suited to accomplish so if we can just
linger and use the word computation
when you talk about computation how do
you think about it do you think purely
on just um the the switching
or do you think something a little bit
larger scale a circuit taken together
performing the basic arithmetic
operations that are then required to do
the kind of
computation that makes up a computer
because when we talk about the speed of
computation is it boiled down to the
basic switching or is there some bigger
picture that you're thinking about well
all right so
maybe we should disambiguate there are a
variety of different kinds of
computation
i don't pretend to be an expert in
the theory of computation or anything
like that i guess it's important to
differentiate though between
digital logic
which represents information as a series
of bits binary digits which you know uh
you can think of them as zeros and ones
or whatever usually they correspond to
a physical system that has two very well
separated states
and then other kinds of computation like
we'll get into more the way your brain
works which
it is i think indisputably processing
information
but
where the computation begins and ends is
not anywhere near as well defined it it
doesn't depend on
these two levels here's a zero here's a
one it's there's a lot of gray area
that's usually referred to as analog
computing
um also
in in conventional digital computers or
um
digital computers in general
you have a concept of what's called
arithmetic depth which is jargon that
basically means how many
sequential operations are performed to
turn
an input into an output and those kinds
of computations in in digital systems
are highly serial
meaning that data streams they don't
branch off too far to the side you do
you have to pull some information over
there and access memory from here and
stuff like that but
by and large the the computation
proceeds in a serial manner
it's not that way in the brain in the
brain
you're always drawing information from
different places it's much more
network-based computing neurons don't
wait for their turn they fire when
they're ready to fire and so it's it's
asynchronous so one of the other things
about a digital system is you're
performing these operations on a clock
and that's a that's a crucial aspect of
it get rid of a clock in a digital
system nothing makes sense anymore the
brain has no clock it builds its own
time scales based on its internal
activity
so so you can think of the brain as kind
of uh
like this like network computation where
it's actually really trivial simple
computers
uh just a huge number of them and
they're networked
i would say it is complex sophisticated
little processors and there's a huge
number of things neurons are not no
offense i don't mean to offend sure no
they're very complicated and beautiful
and yeah but
we often oversimplify them yes they're
actually like there's computation
happening within a neuron right so i i
would say
to think of a a transistor as the
building block of a digital computer is
accurate you use a few transistors to
make your logic gates you build up more
you build up processors from logic gates
and things like that so you can think of
a transistor as a fundamental building
block or you can think of as we get into
more highly parallelized architectures
you can think of a processor as a
fundamental building block to make the
analogy to the
neuro side of things
a neuron is not a transistor a neuron is
a is a processor it has synapses even
synapses are not transistors but they
are more
um they're lower on the information
processing hierarchy in a sense they do
a bulk of the computation but neurons
are entire
processors in and of themselves that can
take in many different kinds of inputs
on many different spatial and temporal
scales and produce many different kinds
of outputs so that they can perform
different computations in different
contexts so this is where it enters this
distinction between computation and
communication
so you can think of neurons performing
computation
and the inter
networking the interconnectivity of
neurons is communication between neurons
and you see this with very large server
systems i've been i mentioned offline
i've been talking to jim keller whose
dream is to build giant computers that
uh you know
the bottleneck there is often the
communication between the different
pieces of computing
so in this paper that we mentioned
optoelectronic intelligence
you say electrons excel at computation
while
light
is excellent for communication
maybe you can linger and say in this
context what do you mean by computation
and communication
what what are electrons what is light
and why do they excel at those two tasks
yeah just to to first speak to
computation versus communication
i would say computation is essentially
taking in
some information
performing operations on that
information and producing new
hopefully more useful information so for
example
um
imagine you have a picture in front of
you
and
there is a key in it and that's what
you're looking for for whatever reason
you want to you want to find the key we
all want to find the key so
the input is that that entire picture
and the output might be the coordinates
where the key is so you've reduced the
total amount of information you have but
you found the useful information for you
in that present moment that's the useful
information you think about this
computation as like controlled
synchronous
sequential not necessarily it could be
that could be how
your system is performing the
computation or it could be
asynchronous it there are lots of ways
to find the key
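The key-in-a-picture example can be sketched as a small function: the input is the whole picture and the output is just the coordinates of the key, so the computation discards most of the input and keeps the useful part. This is only an illustration; the text-grid picture, the `K` symbol, and the name `find_key` are assumptions made for the sketch, and a simple row-by-row scan is just one of the many possible ways to find the key.

```python
from typing import List, Optional, Tuple


def find_key(picture: List[str], key_symbol: str = "K") -> Optional[Tuple[int, int]]:
    """Return (row, column) of the first key_symbol in the picture, or None."""
    for row_index, row in enumerate(picture):
        column_index = row.find(key_symbol)  # -1 when the symbol is not in this row
        if column_index != -1:
            return (row_index, column_index)
    return None


# Hypothetical 3 x 8 "picture": many characters in, two numbers out.
picture = [
    "........",
    "...K....",
    "........",
]
print(find_key(picture))
```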
it depends it depends on the nature of
the data depends on
um that's a very simplified example a
picture with a key in it what about if
you're in the world and you're trying to
decide the best way to
live your life you know that it might be
interactive it might be there might be
some recurrence or some weird
asynchrony i got it so but there's an
input and there's an output and you do
some stuff in the middle that yeah it
goes from the input to the output you've
taken in information and output
different information hopefully reducing
the total amount of information and
extracting what's useful yeah
communication is then
getting that information from the
location in which it's stored because
information is physical as landauer
emphasized and so it is stored in one
place and you need to get that
information to another place so that
something else can
use it for whatever computation it's
working on maybe it's part of the same
network and you're all trying to solve
the same problem but neuron a over here
just
deduced something based on its inputs
and it's now sending that information
across the network to another location
so that would be the act of
communication can you linger on landauer
and saying information is physical rolf
landauer not to be confused with lev
landau
yeah and he he
made huge contributions to our our
understanding of
the reversibility of information in in
this concept that
energy has to be dissipated in computing
when the computation is irreversible but
if you can manage to make it reversible
then you you don't need to expend energy
but if you
um
if you do expend energy to perform a
computation there's sort of a minimal
amount that you have to do and it's kt
ln 2 and it's all somehow related to the
second law of thermodynamics and that
the universe is an information processor
and then we're living in a simulation
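The minimal energy mentioned here is usually written as k_B T ln 2 per irreversibly erased bit, where k_B is the Boltzmann constant and T is the temperature. As a rough sketch, it can be evaluated at room temperature and at the 4 kelvin discussed earlier. The function name and the choice of 300 K for room temperature are assumptions for illustration, and the figures ignore the energy spent on cooling.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant in joules per kelvin (exact SI value)


def landauer_limit_joules(temperature_kelvin: float) -> float:
    """Minimum energy dissipated when one bit is irreversibly erased at this temperature."""
    return BOLTZMANN_J_PER_K * temperature_kelvin * math.log(2)


for temperature in (300.0, 4.0):  # assumed room temperature, and liquid-helium temperature
    energy = landauer_limit_joules(temperature)
    print(f"T = {temperature:5.1f} K  ->  k_B T ln 2 = {energy:.3e} J")
```

The bound scales linearly with temperature, so it is 75 times lower at 4 K than at 300 K.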
so okay sorry sorry for that tangent so
information so that's the defining the
the distinction between computation and
communication
let me say one more thing just to
clarify communication
ideally does not change the information
it moves it from one place to another
but it is preserved
got it okay
all right that's beautiful so
uh then the an electron versus light
distinction and why are electrons
uh good at computation and light good at
communication yes
this is um
there's a lot that goes into it i guess
but just try to speak to the simplest
part of it
electrons interact strongly with one
another they're charged particles so if
i
pile a bunch of them over here
they're feeling a certain amount of
force and they want to they want to move
somewhere else they're strongly
interactive you can also get them to sit
still you can an electron has a mass so
you can you can cause it to
be spatially localized so for
computation that's useful because now i
can make these little devices that put a
bunch of electrons over here and then i
change the the state of
a gate like i've been describing put a
different voltage on this gate and now i
move the electrons over here now they're
sitting somewhere else i have
a physical mechanism with which i can
represent information it's spatially
localized and have knobs that i can
adjust to change where those electrons
are or what they're doing light by
contrast photons of light
uh which are the discrete packets of
energy that were identified by einstein
they do not interact with each other um
especially at low light levels if you're
in a medium and you have a high a bright
high light level
you you can get them to interact with
each other through the interaction with
that medium that they're in but that's
that's a little bit more exotic and for
the purposes of this conversation we can
assume that photons don't interact with
each other so if you have a bunch of
them
all propagating in the same direction
they don't interfere with each other if
i want to send
if i if i have a communication channel
and i put one more photon on it it
doesn't screw up those other ones it
doesn't change what those other ones
were doing at all
so that's really useful for
communication because that means you can
sort of
allow a lot of these photons to flow
uh
with without disruption of each other
and they can they can branch really
easily and things like that but it's not
good for computation because it's very
hard
for
this packet of light to change what this
packet of light is doing they they pass
right through each other so in
computation you want to change
information and if photons don't
interact with each other it's difficult
to get them to change the information
represented by the others so that that's
the fundamental difference is is there
also something about
the way they travel through different
materials
or is that just a particular engineering
no it's not that's deep physics i think
so
this gets back to electrons interact
with each other and photons don't so say
say i'm trying to get a
packet of information
from me to you
and we have a wire going between us in
order for me to send electrons across
that wire i first have to raise the
voltage on my end of the wire and that
means putting a bunch of charges on it
and then that that charge packet has to
propagate along the wire and it has to
get all the way over to you there's that
wire is going to have something that's
called capacitance which basically tells
you how much charge you need to put on
the wire in order to raise the voltage
on it and the capacitance is going to be
proportional to the length of the wire
so the longer the the length of the wire
is the more charge i have to put on it
and
the energy required to charge up that
line and move those electrons to you is
also proportional to the capacitance and
goes as the voltage squared so you get
this huge penalty if you if you want to
send
electrons across a wire over appreciable
distances so distance is an important
thing here when you're doing
communication distance is an important
thing so is the number of connections
i'm trying to make
me to you okay one that's not so bad if
i want to now send it to 10 000 other
friends
then then all of those wires are adding
tons of extra capacitance now not only
does it take forever to put the charge
on that wire and raise the voltage on
all those lines but it takes a ton of
power
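The scaling described above can be sketched numerically. The model below is a simplification under stated assumptions: every connection is treated as its own wire of the same length, the capacitance per unit length is an illustrative figure rather than a measured one, and the energy is taken as the energy stored on the wire, one half C V squared. The function and parameter names are hypothetical.

```python
def wire_charging_energy_joules(
    length_m: float,
    fan_out: int,
    voltage_v: float,
    capacitance_per_m: float = 2e-10,  # assumed ~0.2 fF per micrometre, illustrative only
) -> float:
    """Energy stored when charging fan_out wires of the given length to voltage_v."""
    # Capacitance grows with wire length and with the number of wires driven.
    total_capacitance_f = capacitance_per_m * length_m * fan_out
    # Energy is proportional to capacitance and goes as the voltage squared.
    return 0.5 * total_capacitance_f * voltage_v ** 2


one_connection = wire_charging_energy_joules(length_m=1e-3, fan_out=1, voltage_v=1.0)
many_connections = wire_charging_energy_joules(length_m=1e-3, fan_out=10_000, voltage_v=1.0)
print(f"one connection:     {one_connection:.2e} J")
print(f"10,000 connections: {many_connections:.2e} J")
```

In this model, doubling the length or the number of connections doubles the energy, and doubling the voltage quadruples it.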
and
the number 10000 is not randomly chosen
that's roughly how many connections each
neuron in your brain makes so a
neuron in your brain needs to send 10
000 messages every time it has something
to say you can't do that if you're
trying to
drive electrons from here to 10 000
different places the brain does it in a
slightly different way which we can
discuss how can light achieve the 10 000
connections and why is it um why is it
better in terms of like the energy use
uh required to use light for the
communication of the ten thousand
connections right right so now instead
of trying to send electrons from me to
you i'm trying to send photons so i can
make what's called a waveguide which is just
a simple piece of a material it could be
glass like an optical fiber or silicon
on a on a chip and i just have to i just
have to inject photons into that
waveguide and independent of how long it
is independent of how many different
connections i'm making it doesn't change
the the voltage or anything like that
that i have to raise up on the on the
wire so if i have one more connection if
i add additional connections i need to
add more light to the waveguide because
those photons need to split and go to
different
paths that makes sense but i don't have
a capacitive penalty that sometimes
these are called wiring parasitics there
are no parasitics associated with light
in that same sense so
well just this might be a dumb question
but
how do i catch a photon on the other end
uh what is it is it a material is it
the polymer stuff you were talking about
for the
for a different application for
photolithography like how do you catch
a photon there's a lot of ways to catch a
photon it's not a dumb question it's a
it's a deep and important question that
basically defines a lot of the work that
goes on in our group at nist
one of my group leaders
sae woo nam has built his career around
these superconducting single photon
detectors so
if you're going to try to sort of reach
a lower limit and detect just one
particle of light
superconductors come back into our
conversation and
just picture a simple device where you
have current flowing through a
superconducting wire
and um a loop again or no
let's say yes you have a loop so you
have a superconducting wire that goes
straight down like this and on on your
loop branch you have a little ammeter
something that measures current there's
a resistor up there too
go with me here so um
you're current biasing this so there's
current flowing through that
superconducting branch since there's a
resistor over here
all the current goes through the
superconducting branch now a photon
comes in strikes that superconductor
we talked about this superconducting
macroscopic quantum state that's going
to be destroyed by the energy of that
photon so now that branch of the circuit
is resistive too
and you've properly designed your
circuit so that the resistance on that
superconducting branch is much greater
than the other resistance now all of
your current's going to go that way your
ammeter says oh i just got a pulse of
current that must mean i detected a
photon then where you broke that
superconductivity in a matter of a few
nanoseconds it cools back off dissipates
that energy and the current flows back
through that superconducting branch this
is a
very powerful
superconducting device that allows us to
understand quantum states of light i
didn't realize
a loop like that could be sensitive to a
single photon i mean that um
that seems strange to me because
i mean so what happens when you just
barrage it with photons if you put a
bunch of photons in there essentially
the same thing happens you just drive it
into the normal state it becomes
resistive
and it's not particularly interesting so
you have to be careful how many photons
you send like you have to be very
precise with your communication well it
depends so i would say that that's
actually in the in the application that
we're trying to use these detectors for
that's a feature because what we want is
for
uh
if if a neuron
sends one photon to a synaptic
connection and one of these
superconducting detectors is sitting
there
you get this pulse of current and that
synapse says
event
then i'm going to do what i do when
there's a synapse event i'm going to
perform computations that kind of thing
but if accidentally you send two there
or three or five it does the exact same
got it and so that's this is this is how
in the system that we're
devising here communication is entirely
binary and that's what i tried to
emphasize a second ago communication
should not change the information you're
not saying
oh i got this kind of communication
event four photons no we're not keeping
track of that this neuron fired this
synapse says that neuron fired that's it
so that's a that's a noise filtering
property
of those detectors however there are
other applications where you'd rather
know the exact number of photons that
can be very useful in quantum computing
with light
and
our group does a lot of work around
another kind of superconducting sensor
called a transition edge sensor that uh
adriana lita in our group does a lot of
work on that and
that can tell you based on the amplitude
of the current pulse you divert exactly
how many photons
were in that pulse
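The two detector behaviours described above can be contrasted in a toy model. This is an idealized, noiseless sketch and not a model of the real devices: the threshold detector reports the same event for any nonzero number of photons, while the number-resolving readout is assumed to produce a pulse amplitude exactly proportional to the photon count. All function names and the `pulse_per_photon` parameter are hypothetical.

```python
def threshold_detector_event(photon_count: int) -> bool:
    """Binary readout: one photon or many photons produce the same event."""
    return photon_count >= 1


def number_resolving_amplitude(photon_count: int, pulse_per_photon: float = 1.0) -> float:
    """Idealized readout whose pulse amplitude is proportional to the photon count."""
    return photon_count * pulse_per_photon


def photons_from_amplitude(amplitude: float, pulse_per_photon: float = 1.0) -> int:
    """Recover the photon number from the amplitude (noiseless assumption)."""
    return round(amplitude / pulse_per_photon)


for photons in (0, 1, 2, 5):
    event = threshold_detector_event(photons)
    counted = photons_from_amplitude(number_resolving_amplitude(photons))
    print(f"{photons} photons -> synapse event: {event}, number-resolved count: {counted}")
```

The binary readout is what makes the communication insensitive to accidentally sending extra photons, while the amplitude readout preserves the photon number for uses such as number states.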
so what's that useful for just one way
that you can encode information in
quantum states of light is in the number
of photons you can have what are called
number states and a number state will
have a well-defined number of photons
and maybe the output of your quantum
computation encodes
its information in the number of photons
that are generated so if you have a
detector that is sensitive to that it's
extremely useful can you achieve
like a clock
with photons or is that not important is
there a synchronicity here
in general it can be important
uh clock distribution is a big challenge
in
especially large computational systems
and so yes optical clocks optical clock
distribution
is a is a very powerful technology i i
don't know the state of that field right
now but i imagine that if you're trying
to distribute a clock across
any appreciable size computational
system you you want to use light yeah i
wonder how these giant systems
work especially like uh super computers
do they need to do clock distribution or
are they
doing more ad hoc
parallel like concurrent programming
like there's some kind of locking
mechanisms or something that's the
fascinating question but the
let's zoom in at this very particular
question of
computation
on a processor and communication
between processors
so
what does this
system look like
that you're envisioning
one of the places you're envisioning it
is in the paper on optoelectronic
intelligence
so what are we talking about are we
talking about something that starts to
look a lot like the human brain or does
it still look a lot like a computer what
is the size of this thing does it go
inside a smartphone or as you said does
it go inside something that's more like
a house
like uh
what should we be imagining what are you
thinking about when you're thinking
about these fundamental systems
let me introduce the word neuromorphic
there's this concept of neuromorphic
computing where what that broadly refers
to is um
computing based on the information
processing principles of the brain
and as
digital computing
seems to be pushing towards some
fundamental performance limits
people are considering architectural
advances drawing inspiration from the
brain more distributed parallel network
kind of architectures and stuff and so
there's this continuum of
neuromorphic from
things that are
pretty similar to digital computers but
maybe there are more cores and the way
they send messages is a little bit more
like the way brain
neurons send spikes but for the most
part it's still digital electronics
and then you know you have some things
in between where maybe you're you're
using
transistors but now you're starting to
use them instead of in a digital way in
an analog way and so you're trying to
get those circuits to behave more like
neurons and then that's a little bit
quite a bit more on the
neuromorphic side of things you're
trying to get your circuits although
they're still based on silicon you're
trying to get them to
perform operations that are highly
analogous to the operations in the brain
and that's where a great deal of work is
in neuromorphic computing people like
giacomo indiveri and gert cauwenberghs
jennifer hasler countless others it's
it's a rich and exciting field
uh going back to carver mead in the
late 1980s and then all the way on the
other extreme of the continuum
is where you say i'll give up
anything related to transistors or
semiconductors or anything like that i'm
not starting with the assumption
that i'm going to use any kind of
conventional computing hardware and
instead what i want to do is try and
understand what makes the brain powerful
at the kind of information processing it
does and i want to think from first
principles about
what hardware
is best going to enable us to
capture those information processing
principles in an artificial system
and that's where i live that's where
that's where i'm
doing my exploration these days so uh
what are the first principles
of
brain like computation communication
right yeah this is this is so important
and i'm glad we booked 14 hours for this
because uh i only have 13 i'm sorry
okay so the brain is notoriously
complicated and i think that's a
an important part of why it
why it can do what it does but okay let
me let me try to break it down
uh starting with the devices
neurons as i as i said before they're
they're sophisticated devices in and of
themselves and synapses are too they
they can um
change their state based on the activity
so they they adapt over time that's
crucial to the way the brain works they
don't just adapt on one time scale they
can adapt on
myriad time scales from
the the spacing between pulses the
spacing between spikes that come from
neurons all the way to the age of the
organism
um
also relevant perhaps
i think the most important thing that's
guided my thinking is the the network
structure
of the brain so which can also be
adjusted yeah in different scales
absolutely yes so so you're you're
making new con you're changing the
strength of contacts you're changing the
the spatial distribution of them
although spatial distribution doesn't
change that much once you're a mature
organism but that network structure is
is really crucial
so let me dwell on that for a second
um
you can't talk about the brain without
emphasizing that most of the neurons in
the the neocortex or the prefrontal
cortex the part of the brain that we
think is most responsible for
high-level reasoning and things like
that those neurons make thousands of
connections so you have this network
that is highly interconnected
and
i think it's safe to say that one of the
primary reasons that they make so many
different connections is that allows
information to be communicated very
rapidly from any spot in the network to
any other spot in the network so that's
a that's a sort of spatial aspect of it
you can quantify this in terms of
concepts that are related to fractals
and scale invariance which i think is
a very beautiful concept so
what i mean by that is kind of no matter
what spatial scale you're looking at in
the brain
within certain bounds you see the same
general statistical pattern so if i draw
a box around some region of my cortex
most of the connections
that those neurons within that box make
are going to be within the box to each
other in their local neighborhood and
that's sort of called clustering loosely
speaking but a non-negligible fraction
is going to go outside of that box and
then if i draw a bigger box the pattern
is going to be exactly the same so you
have this scale invariance and you also
have a non-vanishing
probability of a neuron making a
connection very far away so
suppose you you want to plot the
probability of a neuron making a
connection as a function of distance if
that were an exponential function it
would go e to the minus
radius over some characteristic radius
and it would it would drop off
up to some certain radius the
probability would be
reasonably close to one and then
beyond that characteristic length r0
it would it would drop off sharply and
so that would mean that the neurons in
your brain are really
localized and
that's not what we observe instead
what you see is that the probability of
making a longer distance connection it
does drop off but it drops off as a
power law so the probability that you're
going to have a connection at some
radius r goes as r to the minus some
power and that's more like what we see
with with forces in nature like the
electromagnetic force between two
particles or gravity
goes as one over the radius squared so
you can see this in fractals i love that
there's a like a fractal dynamics to the
brain
that
if you zoom out you draw the box and you
increase that box by certain step sizes
you're gonna see the same statistics i
think that's probably
very important to the way the brain
processes information it's not just in
the spatial domain it's also in the
temporal domain and what i mean by that
is
that's incredible that this emerged
through the evolutionary process that
is potentially somehow connected to the way
the physics of the universe works yeah i
i couldn't agree more that it's it's a
deep and fascinating subject that i i
hope to be able to spend my life
studying you think you need to solve
understand this fractal nature in
order to understand intelligence and
computation i do think so i think they're
deeply intertwined yes i think power
laws are
right at the heart of it so just to just
to push that one through the same thing
happens in the temporal domain so
suppose you had
um
suppose your neurons in your brain were
always oscillating at the same frequency
then the probability of finding a neuron
oscillating as a function of frequency
would be this narrowly peaked function
around that certain characteristic
frequency that's not at all what we see
the probability of finding neurons
oscillating or
pulsing producing spikes at a certain
frequency is again a power law which
means there's no
there's no defined scale of the temporal
activity in the brain
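The contrast between an exponential and a power law, described first for connection distance and then for firing frequency, can be sketched numerically. The functions below are unnormalized relative weights rather than fitted probabilities, and the characteristic radius `r0` and the exponent 2 are illustrative assumptions. The point is that a power law changes by the same factor every time the scale doubles, which is one way to express that it has no characteristic scale, while an exponential falls off sharply beyond `r0`.

```python
import math


def exponential_weight(r: float, r0: float = 1.0) -> float:
    """Relative weight e^(-r / r0): falls off sharply beyond the characteristic radius r0."""
    return math.exp(-r / r0)


def power_law_weight(r: float, exponent: float = 2.0) -> float:
    """Relative weight r^(-exponent): no characteristic scale."""
    return r ** (-exponent)


for r in (1.0, 10.0, 50.0):
    # Ratio of the weight at twice the distance to the weight at this distance.
    power_ratio = power_law_weight(2 * r) / power_law_weight(r)
    exponential_ratio = exponential_weight(2 * r) / exponential_weight(r)
    print(
        f"r = {r:5.1f}: power law {power_law_weight(r):.2e} (ratio {power_ratio:.3f}), "
        f"exponential {exponential_weight(r):.2e} (ratio {exponential_ratio:.2e})"
    )
```

With these assumed parameters the power-law ratio stays fixed at every distance, whereas the exponential ratio shrinks as the distance grows, so long-range connections keep a non-vanishing weight only in the power-law case.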
it's you don't what at what
speed do your thoughts occur well
there's a there's a fastest speed that
can occur and that is limited by
communication and other other things but
there's not a characteristic scale we
have thoughts on all temporal scales
from you know a few
tens of milliseconds which is
physiologically limited by our devices
compare that to
tens of picoseconds that i talked about
in superconductors all the way up to the
lifetime of the organism you can still
think about things that happened to you
when you were a kid well if you want to
be really trippy then across multiple
organisms in the entirety of human
civilization you have thoughts that span
organisms right yes taking it to that
level if you're willing to see the
entirety of the human species as a
single organism with the collective
intelligence and that too on a spatial
and temporal scale there's thoughts
occurring and then if you look at not
just the human species but the entirety
of life on earth
as an organism with thoughts
occurring that are more and more
sophisticated thoughts there's a
different spatial and temporal scale
there
this is getting very suspicious
hold on though before we're done i just
want to just tie the bow yes and say
that the
the spatial and temporal aspects are
intimately interrelated with each other
so activity between neurons that are
very close to each other is more likely
to happen on this this faster time scale
and information is going to propagate
and encompass more of the brain more of
your cortices different modules in the
brain are going to be engaged in
information processing on longer time
scales so there's this concept of
information integration where
most neurons are specialized
any given neuron or any cluster of
neurons has its specific purpose but
they're also
very much integrated so you you have
neurons that specialize but share their
information and so that happens through
these fractal nested oscillations that
occur across spatial and temporal scales
i think capturing those dynamics in
hardware
to me that's the goal of neuromorphic
computing so does it need to look so
first of all that's fascinating we
stated some clear principles here
now does it have to
look like the brain outside of those
principles as well like what other
characteristics have to look like the
human brain or can it be something very
different well it depends on what you're
trying to use it for and so i i think a
lot of the community
asks that question a lot what are you
going to do with it and
i i completely get it i think that's a
very important question and it's also
sometimes not the most helpful question
what if what you want to do with it is
study it what if you just want to see
um
what does it what do you have to build
into your hardware in order to observe
these dynamical principles
so
and also
i ask sometimes i ask myself that
question every day and i'm not sure i'm
able to answer that it's like what are
you what are you gonna do with this
particular neuromorphic machine so
suppose what we're trying to do with it
is build something that thinks we're not
trying to get it to make us any money or
drive a car maybe we'll be able to do
that but that's not our goal our goal is
to
see if we can get the same types of
behaviors that we observe in our own
brain and by behaviors in this sense
what i mean
the behaviors of the
components the neurons the network that
kind of stuff i think there's another
element that i didn't really hit on that
that you also have to build into this
and those are architectural principles
they have to do with the
hierarchical modular construction of the
network and without getting too lost in
jargon
the the main point that i think is
relevant there let me try and illustrate
it with a cartoon picture of
the architecture of the brain so in the
brain you have the
the cortex which is sort of this outer
sheet um it's actually a you can it's a
layered structure you can if you could
take it out of your brain you could
unroll it on the table and it would be
about the size of a of a pizza sitting
there
and um
that's a module it does certain
things it processes as györgy buzsáki
would say it processes the what of
what's going on around you but you have
another really crucial module that's
called the hippocampus
and that that network is structured
entirely differently first of all this
cortex that i described has 10
billion neurons in there so numbers
matter here
and they're they're organized in that
sort of power law distribution where the
probability of making a connection drops
off as a power law in space the
hippocampus is another module that's
important for
understanding how
where you are
when you are um
keeping track of of your your position
in space and time and that network is
very much random so the probability of
making a connection
it it almost doesn't even drop off as a
function of distance it's the same
probability that you'll make it here to
over there but there are only about 100
million neurons there so you can have
that huge densely connected module
because
it's not so big
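
A rough sketch of the two wiring rules just described, using the neuron counts quoted in the conversation. The exponent, the probabilities, and the function names are illustrative assumptions, not measured values. The pair count shows why a distance-independent, densely connected module is only affordable when the module is small.

```python
def cortex_like_probability(distance, exponent=2.0, p_at_unit_distance=0.5):
    """Connection probability that falls off as a power law in distance.
    distance must be positive; exponent and p_at_unit_distance are assumed."""
    return min(1.0, p_at_unit_distance * distance ** (-exponent))

def hippocampus_like_probability(distance, p=0.05):
    """Connection probability that ignores distance; p is an assumed value."""
    return p

CORTEX_NEURONS = 10_000_000_000    # "10 billion" as quoted
HIPPOCAMPUS_NEURONS = 100_000_000  # "about 100 million" as quoted

def possible_pairs(n):
    """Number of distinct neuron pairs: the ceiling for all-to-all wiring."""
    return n * (n - 1) // 2

# 100 times more neurons means roughly 10,000 times more possible pairs.
pair_ratio = possible_pairs(CORTEX_NEURONS) / possible_pairs(HIPPOCAMPUS_NEURONS)

print(cortex_like_probability(1.0), cortex_like_probability(10.0))
print(hippocampus_like_probability(1.0), hippocampus_like_probability(100.0))
print(pair_ratio)
```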
and the neocortex or the cortex and the
hippocampus they talk to each other
constantly
and that communication is largely
facilitated by what's called the
thalamus i'm not a neuroscientist here
i'm trying to do my best to recite this
cartoon picture of the brain i gotcha
yeah something like that so this
thalamus is is coordinating the activity
between the neocortex and the
hippocampus and making sure that they
they talk to each other at the right
time and send messages that would be
useful to one another so this all taken
together is called the thalamocortical
complex
and it seems like building something
like that is going to be crucial to
capturing the
types of activity we're looking for
because
those responsibilities those
separate modules they do different
things that's got to be central to
um
achieving these
states of efficient information
integration across space and time
by the way
i am able to achieve this state by
watching uh simulations visualizations
of the thalamocortical complex there's a
few people i forget from where they've
created these incredible visual
illustrations of like
visual stimulation from the eye or
something like that it this in this
image like flowing through the brain wow
i haven't seen that i gotta check that
out so it's one of those things you you
find this stuff in the world
and you see like on youtube it has like
1000 views these like
these visualizations of the human brain
processing information and like
because there's uh there's chemistry
there like because this is act from
actual human brains i don't know how
they're doing the coloring but they're
able to actually trace
the uh like different the the chemical
and the electrical signals throughout
the brain and the visual thing it's like
whoa because it looks kind of like the
universe i mean the whole thing is just
incredible i recommend it highly i'll
probably post a link to it but you can
just look for uh
um
one of the things they simulate is the
uh
thalamocortical uh complex and just
visualization you can find that yourself
on youtube but it's it's beautiful um
the other question i have for you is um
how does memory play into all of this
because all the signals sending back and
forth that's kind of like uh that's
computation
and communication but that's kind of
like
uh you know processing of
inputs to produce outputs
in the system that's kind of like maybe
reasoning maybe there's some kind of
recurrence
but like is there a storage mechanism
that you think about in the context of
neuromorphic computing yeah absolutely
so that's got to be central you have to
have a way that you can
store memories and
there are a lot of different kinds of
memory in the brain that's
yet another example of how it's it's not
a simple system so
there's one kind of memory one way of
talking about memory
uh usually starts in the context of
hopfield networks you were lucky to talk
to john hopfield on this program but the
the basic idea there is uh working
memory is stored in the dynamical
patterns of activity between neurons and
you can you can think of a certain
pattern of activity as an attractor
meaning if you put in
some
signal that's similar enough to other
previously experienced signals
like that then you're going to converge
to the same network dynamics and you
will see
these neurons
participate in the same network patterns
of activity that they have in the past
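
The attractor idea can be sketched with a tiny Hopfield network. This is a textbook toy under stated assumptions, not the hardware discussed here: the two stored patterns, the 8-neuron size, and the function names are all made up for illustration. A corrupted input that is similar enough to a stored pattern settles back into that pattern.

```python
def train_hopfield(patterns):
    """Hebbian outer-product weights with a zero diagonal."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights

def recall(weights, state, max_sweeps=10):
    """Asynchronous sign updates until a full sweep changes nothing."""
    state = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(state)):
            field = sum(w * s for w, s in zip(weights[i], state))
            new_value = 1 if field >= 0 else -1
            if new_value != state[i]:
                state[i] = new_value
                changed = True
        if not changed:
            break
    return state

# Two hypothetical stored memories, chosen to be orthogonal.
stored_a = [1, 1, 1, 1, -1, -1, -1, -1]
stored_b = [1, -1, 1, -1, 1, -1, 1, -1]
weights = train_hopfield([stored_a, stored_b])

# Flip one element of the first memory and let the network settle.
noisy_a = list(stored_a)
noisy_a[0] = -noisy_a[0]
recovered = recall(weights, noisy_a)

print(recovered == stored_a)
```

The set of inputs that settle into a given stored pattern is that pattern's basin of attraction.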
so you can talk about the probability
that
different inputs will allow you to
converge to different basins of
attraction and you might think of that
as oh i saw this face and then i excited
this network pattern of activity because
last time i saw that face i was at you
know what some movie and that that's a
famous person on the screen or something
like that so so that's one memory
storage mechanism but crucial to
the ability to imprint those memories in
your brain is the ability to change the
strength of connection between one
neuron and another that synaptic
connection between them so synaptic
weight update is a massive field of
neuroscience and neuromorphic computing
as well so
there are
two
poles to that
on that spectrum one in okay so in more
in the language of machine learning we
would talk about supervised and
unsupervised learning
in when i'm trying to tie that down to
neuromorphic computing i will use a
definition of supervised learning which
basically means
the external
user the person who's controlling this
hardware
has some knob that they can
tune to change each of the synaptic
weights depending on whether or not the
network's doing what you want it to do
whereas what i mean in this conversation
when i say unsupervised learning is that
those synaptic weights are are
dynamically changing in your network
based on nothing that the user is doing
nothing that there's no wire from the
outside going into any of those synapses
the network itself is reconfiguring
those synaptic weights based on
physical
properties that you've built into the
devices so if if the synapse receives a
pulse from here
and that causes the neuron to spike
some circuit built in there with no help
from me or anybody else adjusts the
weight in a way that makes it more
likely to
store the useful information and excite
the useful network patterns and makes it
less likely that random noise useless
communication events will have an
important uh effect on the network
activity so there's memory encoded in
the weights uh the synaptic weights yeah
what about the formation of something
that's not often done in machine
learning the formation of new synaptic
connections right well that seems to so
again not not a neuroscientist here but
my reading of the literature is that
that's particularly crucial in early
stages of brain development where a
newborn is uh
born with tons of extra synaptic
connections and it's actually pruned
over time so the number of synapses
decreases as opposed to growing new
long-distance connections it is possible
in the brain to grow new neurons and
um
assign new synaptic connections
but it doesn't seem to be the primary
mechanism by which the brain is learning
so for example like right now
sitting here talking to you you say lots
of interesting things and i learn what i
learn from you and i can remember things
that you just said and i didn't grow new
axonal connections down to new synapses
to to enable those it's
plasticity mechanisms in the between the
synaptic connections between neurons
that enable me to
learn on that time scale so at the very
least that you can sufficiently
approximate that with just weight
updates you don't need to form new
connections i would say weight updates
are a big part of it i also think
there's more because
broadly speaking when we're doing
machine learning our networks say we're
talking about feed forward deep neural
networks
the temporal domain is not really part
of it okay you're gonna put in an image
and you're gonna get out a
classification and you're gonna do that
as fast as possible so you care about
time but time is not part of the essence
of this thing really
um whereas in spiking neural networks
what we see in the brain
time is as crucial as space and they're
intimately intertwined as i've tried to
say
and so adaptation on
on different time scales is important
not just in memory formation
although it plays a key role there but
also in just keeping the activity in a
useful dynamic range so you have other
plasticity mechanisms not just weight
update or at least not on the
time scale of many action potentials but
even on the shorter time scale so a
synapse can
become much less efficacious it
can transmit a weaker signal after the
second third fourth action potential to
occur in a sequence so that's what's
called short-term synaptic plasticity
which is a form of learning you're
learning that i'm getting too much
stimulus from looking at something
bright right now so i need to tone that
down you know
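
A minimal sketch of short-term synaptic depression, assuming a simple resource-depletion picture. The fractions and the function name are hypothetical, and real synapses and the superconducting circuits discussed here may behave differently. Each spike uses up part of a finite resource that only partly recovers before the next spike, so successive responses get weaker.

```python
def depressing_synapse(spike_count, use_fraction=0.4, recovery_fraction=0.1):
    """Return the response to each spike in a rapid train.
    use_fraction: share of available resources released per spike (assumed).
    recovery_fraction: share of the depleted resources restored between
    spikes (assumed)."""
    resources = 1.0
    responses = []
    for _ in range(spike_count):
        released = use_fraction * resources
        responses.append(released)
        resources -= released
        resources += recovery_fraction * (1.0 - resources)
    return responses

responses = depressing_synapse(5)
print(responses)
```

With these assumed parameters the response to each later spike is smaller than the one before, which is the toning down described above.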
there's also another really important
mechanism in learning it's called
metaplasticity what that seems to be is
a
a way that you change not the weights
themselves but the rate at which the
weights change so
when i am in
say a lecture hall and my
this is a
potentially terrible cartoon example but
let's say i'm in a lecture hall and uh
it's time to learn right so my brain
will release more perhaps dopamine or
some
neuromodulator that's going to change
the the rate at which synaptic
plasticity occurs so that can make me
more sensitive to learning at certain
times more sensitive to overwriting
previous information and less sensitive
at other times and finally as long as
i'm rattling off the list i think
another concept that falls in the
category of learning or memory
adaptation is homeostasis or homeostatic
adaptation where
neurons have the ability to
control their firing rate so if if one
neuron is just like blasting way too
much it will naturally tone itself down
its threshold will adjust
so that it stays in a useful
dynamical range and we see that that's
that's captured in in deep neural
networks where you don't just change the
synaptic weights but you can also move
the thresholds of simple neurons in
those models and so to uh
to achieve
this
spiking neural networks
you want to use
like
you want to implement the first
principles that you mentioned of the
temporal and the spatial
fractal dynamics here so you can you can
communicate locally you can communicate
across much greater distances and do the
same thing
in space and do the same thing in time
now
you have like a chapter called
superconducting hardware for
neuromorphic computing so what are some
ideas
that integrates some of the things we've
been talking about in terms of the first
principles of neuromorphic computing
and the ideas that you outline in uh
optoelectronic intelligence
yeah
so let me start i guess on the
communication side of things because
that's
what led us down this track in the first
place by us i'm talking about
my my team of colleagues at nist
you know saeed khan bryce primavera sonia
buckley jeff chiles adam mccaughan alex
tait to name a few our group
leaders sae woo nam and rich mirin we've all
contributed to this so this is not this
is not me saying
necessarily just the things that that
i've proposed but sort of where our
team's thinking has evolved over the
years can i quickly ask what is nist
and where is this amazing group of
people located nist is the national
institute of standards and technology
the
the the larger facility is out in
gaithersburg maryland our team is
located in boulder colorado
um
nist is a
federal agency under the department of
commerce we do a lot with by we i mean
other people at nist would do a lot with
standards you know um
making sure that we understand the
system of units international system of
units uh precision measurements there's
a lot going on in
uh electrical engineering material
science and it's historic i mean i mean
it's like it's one of those it's like
mit or something like that it has a
reputation over many decades of just
being this really um
a place where there's a lot of brilliant
people have done
a lot of amazing things but in terms of
the people in your team
in this team of people involved in the
concept we're talking about now i'm just
curious what kind of disciplines are we
talking about what is it mostly
physicists and electrical engineers some
material scientists
but i would say
yeah i think physicists and electrical
engineers my background is in photonics
the use of light for technology so
coming from there i i tend to
have found colleagues that are more from
that background although
uh adam mccaughan has more of a superconducting
electronics background we need a
diversity of folks this project is sort
of cross-disciplinary i would love to be
working more with neuroscientists and
things um but
we haven't we haven't reached that scale
yet but yeah
you're focused on the hardware side
which requires all the disciplines that
you mentioned yes and then of course
neuroscience may be a source of
inspiration for some of the the the
long-term vision i would actually call
it more than inspiration i would call it
sort of um
a road map you know
we're not trying to to build exactly the
brain but i don't think it's enough to
just say oh neurons kind of work like
that let's kind of do that thing i mean
we're
very much following the concepts that
the cognitive sciences have laid out for
us which i believe is is a
really robust road map i mean just on a
little bit of a tangent it's often
stated that we just don't understand the
brain and so it's really hard to
replicate it because we just don't know
what's going on
and i maybe
five or seven years ago i would have
said that but
as i got more interested in the subject
i had read more of the neuroscience
literature and i was just taken by the
exact opposite sense i can't believe how
much they know about this i can't
believe how
mathematically rigorous and um sort of
theoretically complete a lot of the
concepts are that's not to say we
understand consciousness or we
understand the self or anything like
that but why is the brain what is the
brain doing and why is it doing those
things we have a neuroscientists have a
lot of answers to those questions so
there's a lot if you're a hardware
designer that just wants to get going
whoa it's pretty clear which direction
to go in i think
okay uh so
i love i love the the optimism behind
that but um in the implementation of
these systems
that uh use superconductivity
how do you make it happen
so to me it starts with thinking about
the communication network you know for
sure that
the ability of each neuron to
communicate to many thousands of
colleagues across the network is
indispensable i take that as a core
principle of my
my architecture my thinking on the
subject
so coming from
coming from a background in photonics it
was very natural to say okay we're going
to use light for communication just in
case listeners may not know
light is often used in communication i
mean if you think about radio that's
light it's long wavelengths but it's
electromagnetic radiation it's the same
it's the same physical phenomenon
obeying exactly the same maxwell's
equations and then all the way down to
uh fiber optics now you're using
visible or near-infrared wavelengths of
light but the way you send messages
across the ocean is now contemporary
over optical fibers so using light for
communication is
not a stretch it makes perfect sense so
you might ask well why don't you use
light for communication in a
conventional
microchip and the answer to that is i
believe
physical if we had
a light source on a silicon chip
that was as simple as a transistor
there would not be a processor
in the world that didn't use light for
communication at least above some
distance how many light sources are
needed oh you need a light source at
every single point a light source per
neuron per neuron per per liter but then
if you could have a really small and
nice light source you can
your definition of neuron could be
flexible could be yes yes sometimes it's
helpful to me to say
in this hardware a neuron is that entity
which has a light source that
and i i can and then there was light
yeah i mean i can explain more about
that but um somehow this like rhymes
with consciousness because the the
people will often say the light of
consciousness so that consciousness is
that which is conscious i got it
that's not my quote that's me that's my
quote
uh you see that quote comes from my
background yours is in optics mine in
light mine is in darkness
so go ahead
so what the point i was making there is
that
if it was easy to manufacture light
sources along with transistors on a
silicon chip they would be everywhere
yeah and it's not easy it's there people
have been trying for decades and it's
actually extremely difficult i think an
important part of our research is is
dwelling right at that at that spot
there so is it physics or engineering
so okay so it's it's physics i think so
and what i mean by that is
as as we discussed
silicon is the material of choice for
transistors and it it's
very difficult to imagine that that's
going to change anytime soon silicon is
notoriously bad at emitting light and
that has to do with
the immutable properties of silicon
itself the way that the energy bands are
structured in silicon you're never going
to make silicon efficient as a light
source at room temperature
without doing very exotic things that
degrade its ability to interface nicely
with those transistors in the first
place so
that's that's like one of these things
where it's why why is nature dealing us
that blow you give us these beautiful
transistors and you give us all the
motivation to use light for
communication but then you don't give us
a light source so well okay you do give
us a light source compound
semiconductors like we talked about back
at the beginning an element from group
three and an element from group five
form an alloy where every other lattice
site switches which element it is those
have much better properties for
generating light
you put electrons in light comes out
almost 100 percent of the electron
hole
it can be made efficiently
i'll take your word for it okay however i
say it's physics not engineering because
it's very difficult to get those
compound semiconductor light sources
situated with your silicon in order to
do that ion implantation that i talked
about at the beginning high temperatures
are required
so you you got to make all of your
transistors first and then put the
compound semiconductors on top of there
you can't grow them afterwards because
that requires high temperature it screws
up all your transistors you try and
stick them on there they don't have the
same lattice constant the spacing
between atoms is different enough that
it just doesn't work so
nature does not seem to be telling us
that hey go ahead and combine light
sources with your digital switches for
conventional digital computing and
conventional digital computing will
often
require smaller scale i guess in terms
of like smartphone like
so in which kind of systems
does nature hint that we can use
um
light and photons for communication well
so let me just try and be clear you can
use light for communication in digital
systems just the light sources are not
intimately integrated with the silicon
you you manufacture all the silicon you
have your microchip
plunk it down and then you manufacture
your light sources separate chip
completely different process made in a
different foundry
and then you put those together at the
package level got it so now you have
you have some
i would say a great deal of
architectural limitations that
are introduced by that sort of
package level integration as opposed to
monolithic on the same chip integration
but it's still a very useful thing to do
and that's
where i had done some work previously
before i came to nist there's a project
led by vladimir stojanović that now spun
out into a company called ayar labs led by
mark wade and chen sun where they're
doing exactly that so you have your
light source chip
your
silicon chip whatever it may be doing
maybe it's digital electronics maybe
it's some other control purpose
something
and the the silicon chip
drives the the light source chip and
modulates the intensity of the light so
you can get data out of the package on
an optical fiber and that still gives
you tremendous advantages in bandwidth
as opposed to sending those signals out
over electrical lines
but
it is
somewhat peculiar to my eye that
they have to be integrated at this
package level and those those people i
mean they're so smart those are my
my colleagues that i respect a great
deal so it's it's very clear that it's
not just
they're making a bad choice this is what
physics is telling us it just wouldn't
make any sense to to try to stick them
together yeah so there even if it's
difficult
um
it's uh easier than the alternative
unfortunately i think so yes and i again
i need to go back and and make sure that
i'm not taken the wrong way i'm not
saying that the pursuit of integrating
compound semiconductors with silicon is
fruitless and shouldn't be pursued it
should and people are doing great work
kei may lau and john bowers others
they're
they're doing it and they're making
progress but to my eye it doesn't look
like that's ever going to be
just
the standard
monolithic light source on silicon
process i i just don't see it it's yeah
so nature kind of points the way usually
and if you resist nature it's just
you're gonna have to do a lot more work
and it's gonna be expensive and not
scalable got it but okay so let me uh
let's go like far into the future let's
imagine this gigantic neuromorphic
computing system
that uh simulates all of our realities
it currently is mentioned matrix four so
this thing
uh this powerful computer
how does it operate like so so what what
are the neurons
what is the communication what's your
sense all right so let me let me now
after spending 45 minutes trashing light
source integration with silicon let me
now say why i'm basing my entire life
yeah professional life on
integrating light sources with
electronics i think the game is
completely different when you're talking
about superconducting electronics
for
several reasons um
let me try to go through them
one is that as i mentioned it's
difficult to integrate those compound
semiconductor light sources with silicon
working with silicon
is a requirement that is introduced by
the fact that you're using semiconducting
electronics in superconducting
electronics you're still going to start
with a silicon wafer but it's it's just
the bread for your sandwich in a lot of
ways you're not using that silicon in
precisely the same way for the
electronics you're now depositing
superconducting materials on top of that
the prospects for integrating light
sources with
that kind of an electronic process are
certainly less explored but i think much
more promising because you don't need
those light sources to be intimately
integrated with the transistors that's
where the problems come up they don't
need to be lattice matched to the
silicon all that kind of stuff instead
it seems possible that you can take
those compound semiconductor light
sources
stick them on the silicon wafer and then
grow your superconducting electronics on
the top of that it's at least not
obviously going to fail so the
computation would be done on the
superconductive material as well yes the
computation is done in the
superconducting electronics
and the light sources receive signals
that say hey a neuron reached threshold
produce a pulse of light send it out to
all your downstream synaptic connections
those are again
superconducting electronics perform your
computation and you're off to the races
your network works so then if we can
rewind real quick so what are the
limitations or the challenges of
superconducting electronics
when we think about constructing these
kinds of systems so
actually let me let me say uh one other
thing about the light sources yes please
and then i'll then i'll move on i
promise because this is this is probably
tedious first this is super exciting
okay one other thing about the light
sources i said that silicon is terrible
at emitting photons it's just not what
it's meant to do however the game is
different when you're at low temperature
if you're working with superconductors
you have to be at low temperature
because they don't work otherwise when
you're at 4 kelvin silicon is not
obviously a terrible light source it's
still not as efficient as compound
semiconductors but it might be good
enough for this application
the final thing that i'll mention about
that is again leveraging superconductors
as i said
in in a different context
superconducting detectors can receive
one single photon
in that conversation i failed to mention
that semiconductors can also receive
photons that's the primary mechanism by
which it's done
a camera in your phone that's receptive
to visible light it's receiving
photons it's based on silicon or you can
make it in different semiconductors for
different wavelengths
but it requires on the order of a
thousand a few thousand photons to
receive a pulse
now when you're using a superconducting
detector you need one photon exactly one
i mean
one or more
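
The arithmetic behind this comparison, as a rough sketch. The photon counts are the round figures quoted in the conversation ("a thousand" versus "one"). The 1.2 micrometer wavelength is an assumption chosen only to put an energy scale on a pulse, and the names are hypothetical.

```python
import math

SEMICONDUCTOR_PHOTONS_PER_PULSE = 1000  # "on the order of a thousand" as quoted
SUPERCONDUCTOR_PHOTONS_PER_PULSE = 1    # "one photon" as quoted

# Reduction in required light level, expressed in orders of magnitude.
orders_of_magnitude = math.log10(
    SEMICONDUCTOR_PHOTONS_PER_PULSE / SUPERCONDUCTOR_PHOTONS_PER_PULSE
)

PLANCK_J_S = 6.62607015e-34        # Planck constant (exact SI value)
LIGHT_SPEED_M_S = 299_792_458.0    # speed of light (exact SI value)
ASSUMED_WAVELENGTH_M = 1.2e-6      # assumption: a near-infrared wavelength

def pulse_energy_joules(photon_count, wavelength_m=ASSUMED_WAVELENGTH_M):
    """Optical energy of a pulse: photon_count * h * c / wavelength."""
    return photon_count * PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m

print(orders_of_magnitude)
print(pulse_energy_joules(SEMICONDUCTOR_PHOTONS_PER_PULSE))
print(pulse_energy_joules(SUPERCONDUCTOR_PHOTONS_PER_PULSE))
```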
so the fact that your synapses can now
be based on superconducting detectors
instead of semiconducting detectors
brings the light levels that are
required down by some three orders of
magnitude so now
you don't need good light sources you
can have the world's worst light sources
as long as they spit out maybe a few
thousand photons every time a neuron
fires
you have the hardware
principles in place that you might be
able to perform this optoelectronic
integration to me optoelectronic
integration is it's just so enticing we
want to be able to leverage electronics
for computation light for communication
working with silicon microelectronics at
room temperature that has been
exceedingly difficult
and i hope
that when we move to the superconducting
domain
target a different application space
that is neuromorphic instead of digital
and use superconducting detectors
maybe optoelectronic integration comes
to us okay so there's a bunch of
questions so one is temperature
so in these kind of hybrid heterogeneous
systems
what's the temperature what are some of
the constraints of the operation here
does it all have to be at four kelvin as
well four kelvin everything has to be at
four kelvin
okay so what are the other engineering
challenges of making this kind of
optoelectronic systems
let me just dwell on that four kelvin
for a second because some people hear
four kelvin and they just get up and
leave they just say i don't i'm not
doing it you know and to me that's very
earth-centric species-centric we live in
300 kelvin so we want our technologies
to operate there too i totally get it
yeah what's zero celsius zero celsius is
273 kelvin so
we're talking very very cold here this
is this is even boston cold
this is real cold yeah siberia no okay
so just for reference the the
temperature of the cosmic microwave
background is about 2.7 kelvin so we're
still
warmer than deep space
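
The temperatures mentioned here, put side by side in a short sketch. The 4, 300, and 2.7 kelvin values are the ones quoted in the conversation. The conversion uses the standard 273.15 offset rather than the rounded 273, and the variable names are made up for illustration.

```python
def kelvin_to_celsius(kelvin):
    """Convert kelvin to degrees Celsius using the standard 273.15 offset."""
    return kelvin - 273.15

OPERATING_K = 4.0   # superconducting hardware, as quoted
ROOM_K = 300.0      # "we live in 300 kelvin", as quoted
CMB_K = 2.7         # cosmic microwave background, as quoted

operating_celsius = kelvin_to_celsius(OPERATING_K)
room_to_operating_ratio = ROOM_K / OPERATING_K
still_warmer_than_deep_space = OPERATING_K > CMB_K

print(operating_celsius)
print(room_to_operating_ratio)
print(still_warmer_than_deep_space)
```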
good so that when the universe
dies out
it'll be colder than 4k it's already
colder than 4k in the in the expanses
you know you don't have to get that far
away from the earth in order to to drop
down to not far from what you're saying
is the aliens that live at the edge
of the observable universe
are using superconductive material for
their computation they don't have to
live at the edge of the universe the
aliens that are more advanced than us
in their solar system are
doing this
in their asteroid belt
we can get to that oh
because they can get to that
temperature easier there sure yeah all
you have to do is reflect the sunlight
away and you have a huge head start oh
so the sun is the problem here yeah like
it's warm here on earth yeah okay okay
so can you uh so how do we get to 4k
what's well okay so very different kind
of 4k
temperature yeah what i want to say
about temperature is that
if you can swallow that if you can say
all right i give up
applications that have to do with my
cell phone
and the convenience of
you know a laptop on a train and you
instead
for me i'm i'm very much in the
scientific head space i'm not looking at
products i'm not looking at what this
will be useful to sell to consumers
instead i'm thinking about scientific
questions well it's just not that bad to
have to work at 4 kelvin we do it all
the time in our labs at nist and so does
i mean
for reference the entire quantum
computing
sector
usually has to work at something like
100 millikelvin 50 millikelvin so now
you're talking another factor of 100
even colder than that a fraction of a
degree and everybody seems to think
quantum computing is going to take over
the world that
it's so much more expensive to have to
get that extra
factor
of 10 or whatever colder
and yet it's not stopping people from
investing in in that area and by
investing i i mean putting their
research into it as well as venture
capital or whatever so oh so so based on
the energy of what you're commenting on
i'm getting a sense that's one of the
criticisms of this approach is 4k four
kelvin
is uh is a big negative it is the show
stopper for a lot of people they just i
mean and understandably i i'm not saying
that
that that's not a consideration of
course it is for for some okay so
different motivations for different
people in the academic world suppose you
spent your whole life learning about
silicon microelectronic circuits you you
send a design to a foundry they send you
back a chip
and you go test it at your tabletop
and now i'm saying here now learn how to
use all these cryogenics so you can do
that at 4 kelvin
no come on man i don't want to do that
that sounds it's the old momentum the
titanic or the turning yeah kind of but
you're saying that's not that too much
of a finding when we're looking at large
systems and the gain you can potentially
get from them that's not that much of a
cost and when you want to answer the
scientific question about what are the
physical limits of cognition
well the physical limits they don't care
if you're at 4 kelvin if you can perform
cognition at a scale orders of magnitude
beyond any room temperature technology
but you got to get cold to do it
you're going to do it and to me that's
the interesting
application space
it's not even an application space
that's the interesting scientific
paradigm
so i i personally am not going to let
low temperature
stop me from realizing
a technological
domain or or realm that is
achieving in most ways everything else
that i that i'm looking for in my
hardware so that okay that's a big one
is there other kind of engineering
challenges that you envision yeah yeah
yeah so let me take a moment here
because i haven't really described
what i mean by a neuron or a network in
this particular hardware yeah do you
want to talk about loop neurons and
there's so many fascinating but you just
have so many amazing papers that people
should definitely check out and uh the
titles alone are just killers so anyway
go ahead right so let me say big picture
based on optics photonics for
communication superconducting
electronics for computation how how does
this all work so
a neuron in this in this hardware
platform can be thought of as circuits
that are based on josephson junctions
like we talked about before
where every time a photon comes in so
let's start by talking about a synapse a
synapse receives a photon one or more
from a from a different neuron and it
converts that optical signal to an
electrical signal
the amount of current that that adds to
a loop is controlled by the synaptic
weight so as i said before you're
popping fluxons into a loop right so
a photon comes in it hits a
superconducting single photon detector
one photon the the absolute physical
minimum that you can communicate from
one place to another with light and that
detector then converts that to an
electrical signal and the amount of
signal is uh correlated with some kind
of weight yeah so the synaptic weight
will tell you how many fluxons you pop
into the loop it's an analog number
we're doing analog computation now well
can you just linger on that what the
heck is the fluxon are we supposed to
know this or is this is this a funny uh
it's like the big bang is this is this a
funny word for something deeply
technical no let's let's try to avoid
using the word fluxon because it's
not actually necessary when a when a
photon it's fun to say though so uh so
it's very necessary i would say when
when a photon hits that superconducting
single photon detector
current is added to a superconducting
loop
and the the amount of current that you
add is an analog value it can have
eight bit equivalent resolution
something like that
10 bits maybe that's amazing by the way
this is starting to make a lot more
sense when you're using superconductors
for this the energy
of that circulating current is
is less than the energy of that photon
so your
your energy budget is not destroyed by
doing this analog computation
so now in the language of a
neuroscientist you would say that's your
post-synaptic signal you have this
current being stored in a loop you can
decide what you want to do with it most
likely you're going to have it decay
exponentially so
every single synapse is going to have
some given time constant
and that's determined by set by putting
some
resistor in that in that superconducting
loop so a synapse a synapse event occurs
when a photon strikes a detector adds
current to that loop it decays over time
that's the postsynaptic signal then you
can process that in a dendritic tree
bryce primavera and i have a paper
that we've submitted about that
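A minimal sketch of the synapse event just described, assuming each detected photon adds a current step equal to the synaptic weight and the loop current then decays exponentially. The function names, the linear summing of events, and the inductance and resistance values are illustrative assumptions, not from the conversation. The time constant of a loop with inductance L and series resistance R is L/R.

```python
import math


def loop_time_constant(inductance_h, resistance_ohm):
    """Decay time constant (seconds) of an inductive loop with a series resistor."""
    return inductance_h / resistance_ohm


def synapse_current(t, photon_times, weight, tau):
    """Post-synaptic loop current at time t, in arbitrary units.

    Each photon detected at time t0 <= t contributes weight * exp(-(t - t0) / tau).
    Summing the contributions linearly is an assumption; it ignores any
    saturation of the loop.
    """
    return sum(
        weight * math.exp(-(t - t0) / tau)
        for t0 in photon_times
        if t0 <= t
    )


# Hypothetical component values chosen only to give a round time constant.
tau = loop_time_constant(inductance_h=1e-6, resistance_ohm=1.0)
photon_times = [0.0, 0.5e-6]  # two synapse events, in seconds

for t in (0.0, 0.5e-6, 1.5e-6, 5e-6):
    current = synapse_current(t, photon_times, weight=1.0, tau=tau)
    print(f"t = {t:.1e} s  current = {current:.4f}")
```

A larger resistor shortens the time constant, so the synapse forgets each event faster.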
for the more neuroscience oriented
people there's a lot of dendritic
processing a lot of plasticity
mechanisms you can implement with
essentially exactly the same circuits
you have this
one simple building block circuit that
you can use for a synapse for a dendrite
for the neuron cell body for all the
plasticity functions it's all based on
the same building block just tweaking a
couple parameters so this basic building
block has both an optical and an
electrical component and then you can
just
build arbitrarily large systems with that
close you're not at fault for thinking
that that's what i meant what i what i
should say is that
if you want it to be a synapse you tack
a detector a superconducting detector
onto the front of it and if you want it
to be anything else there's no optical
component got it so at the front
optics in the front uh electrical stuff
in the back electrical yeah in the
processing and in the the output signal
that it sends to the next stage of
processing
further so the dendritic trees is
electrical it's all electrical it's all
electrical in the superconducting domain for
anybody who's up on their
superconducting circuits it's just based
on a dc squid the most ubiquitous it's
which is a circuit composed of two
josephson junctions so it's it's a very
bread and butter kind of thing and then
the only place where you go beyond that
is the neuron cell body itself it's
receiving all these electrical inputs
from the synapses or dendrites or
however you've structured that
particular unique neuron and when it
reaches its threshold which occurs by
driving a josephson junction above its
critical current it produces a pulse of
current which starts an amplification
sequence voltage amplification that
produces light out of a transmitter so
one of one of our colleagues adam mccaughan
and sonia buckley as well did a lot of
work on the
the light sources and the
amplifiers that drive the current and
produce sufficient voltage to drive
current through that now semiconducting
part so that light source is the
semiconducting part of a neuron
and that so the neuron has reached
threshold it produces a pulse of light
that light then fans out across a
network of waveguides to reach all the
downstream synaptic terminals
that
do perform this process themselves so
it's probably worth explaining what a
network of waveguides is because a lot
of listeners aren't going to know that
look up the papers by jeff chiles on
this one but basically
light can be guided in a a simple
basically wire of usually an insulating
material so
silicon silicon nitride different
different kinds of glass just like in a
fiber optic it's glass silicon dioxide
that makes it a little bit big we want
to bring these down so we use different
materials like silicon nitride but
basically just imagine a rectangle of
some material
that just goes and branches
forms different
different branch points that target
different sub-regions of the network you
can transition between layers of these
so now we're talking about building in
the third dimension which is absolutely
crucial so that's what waveguides are
so yeah that's great uh what why the
third dimension is crucial
okay so yes you were talking about what
are some of the technical limitations
one of the things that
i believe we have to grapple with is
that
our brains are miraculously compact for
the number of neurons that are in our
brain
it sure does fit in a small volume as it
would have to if we're going to be
biological organisms that are resource
limited and things like that
any kind of hardware neuron is almost
certainly going to be much bigger than
that if it is of comparable complexity
even whether it's based on silicon
transistors okay a transistor seven
nanometers that doesn't mean a
semiconductor based neuron is seven
nanometers they they're big they they
require
many transistors different other things
like capacitors and things that store
charge they end up being
on the order of a hundred microns by a
hundred microns and it's difficult to
get them down any smaller than that the
same is true for superconducting neurons
and the same is true if we're trying to
use light for communication even if
you're using electrons for communication
you you have these wires where
okay the elect the size of an electron
might be angstroms but the size of a
wire is not angstroms and if you try and
make it narrower the resistance just
goes up so it you don't actually win
to communicate over long distances you
need your wires to be
microns wide and it's the same thing for
waveguides waveguides are essentially
limited by the wavelength of light and
that's going to be about a micron so
whereas compare that to an axon the
analogous component in the brain which
is
10 nanometers in diameter something like
that they
they're bigger when they need to
communicate over long distances but
grappling with the size of these
structures is inevitable and crucial and
so
in order to make systems of comparable
scale to the human brain by scale here i
mean number of interconnected neurons
you absolutely have to be using the
third spatial dimension
and that means
on the wafer you need multiple layers of
both active and passive components
active i mean
superconducting electronic circuits
that are performing computations and
passive i mean these waveguides that
are routing the optical signals to
different places you have to be able to
stack those if you can get to something
like
10 planes of each of those or maybe not
even 10 maybe 5 6 something like that
then you're in business now you can get
millions of neurons on a wafer
but that's not that's not anywhere close
to the brain scale in order to get to
the scale of the human brain you're
going to have to also use the third
dimension in the sense that
entire wafers need to be stacked on top
of each other with fiber optic
communication between them and we need
to be able to fill
a space the size of this table with
stacked wafers
and that's when you can get to some 10
billion neurons like your human brain
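The scaling argument above is simple arithmetic, sketched here under stated assumptions. The conversation only says "millions of neurons on a wafer" and "some 10 billion neurons", so the per-wafer figures below are placeholder values and the function name is hypothetical.

```python
def wafers_needed(target_neurons, neurons_per_wafer):
    """Smallest whole number of wafers that holds target_neurons."""
    # Integer ceiling division, so there is no floating-point rounding.
    return -(-target_neurons // neurons_per_wafer)


TARGET_NEURONS = 10_000_000_000  # "some 10 billion neurons"

# Placeholder per-wafer counts spanning "millions of neurons on a wafer".
for per_wafer in (1_000_000, 5_000_000, 10_000_000):
    count = wafers_needed(TARGET_NEURONS, per_wafer)
    print(f"{per_wafer:>10,} neurons per wafer -> {count:,} wafers")
```

Under these assumptions the count lands in the thousands of wafers, which is why stacking wafers and linking them with fiber comes up at all.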
and i i don't think that's specific to
the optoelectronic approach that we're
taking i think that applies to any
hardware where you're trying to reach
commensurate scale and complexity as the
human brain so you need that fractal
stacking so stacking on the wafer and
stacking of the wafers and then whatever
the system that combined this stacking
of the tables with the wafers and it has
to be fractal all the way you're exactly
right because that's the only way that
you can efficiently get information from
a small point to across that whole
network it has to have the the power law
connecting and photons are like uh
optics throughout yeah absolutely once
you're at this scale to me it's just
obvious of course you're using light for
communication you have
fiber optics
given to us
you know from nature so simple
the the thought of even trying to this
any kind of electrical communication
just doesn't
it doesn't make sense to me i'm not
saying it's wrong i don't know but
that's where i'm coming from so let's
return to loop neurons
why are they called loop neurons
yeah the term loop neurons comes from
the fact like we've been talking about
that they rely heavily on these
superconducting loops so
even in
a lot of forms of digital computing with
superconductors storing uh a signal in a
superconducting loop is a a primary
technique
in this particular case it's it's just
loops everywhere you look so the
the strength of a synaptic weight is
going to be set by the state the amount
of current circulating in a loop that is
coupled to the synapse so
memory is
implemented as current circulating in a
superconducting loop
the the coupling between say a synapse
and a dendrite or a synapse and the
neuron cell body occurs through
loop loop coupling through transformer
so current circulating in a synapse is
going to induce current in a in a
different loop a receiving loop in the
in the neuron cell body
so
since all of the computation is
happening in these flux storage loops
and they play such a central role in in
how the information is processed how
memories are formed all that stuff
i didn't think too much about it just
call them loop neurons because it rolls
off the tongue a little bit better than
superconducting optoelectronic neurons
okay so uh how do you design circuits
for these loop neurons that's a great
question there's a lot of different
scales of design so
at the level of just one synapse you can
use conventional methods
they don't they're not that complicated
as as far as superconducting electronics
goes it's just
four josephson junctions or something
like that depending on how much
complexity you want to add so you can
just directly simulate each component in
in spice
we've been what
it's standard electrical simulation
software okay basically so you're just
you're just explicitly solving the
differential equations that describe the
circuit elements and then you can stack
these things together in that simulation
software to then build circuits you can
but that becomes computationally
expensive so one of the things when when
covid hit we knew we had to turn some
attention to more
things you can do at home in your
basement or whatever and one of them was
was
computational modeling so
we started working on
adapting
abstracting out the circuit performance
so that you don't have to explicitly
solve the circuit equations which for
josephson junctions usually needs to be
done on like a picosecond time scale and
you have
a lot of nodes in your circuit so it
results in a lot of differential
equations that need to be solved
simultaneously we were looking for a way
to simulate these circuits
that is scalable up to networks of
millions or so neurons is sort of where
we're targeting right now
so we were able to
analyze the behavior of these circuits
and as i said there it's based on these
simple building blocks so you really
only need to understand this one
building block and if you get a good
model of that boom it tiles and you you
can change the parameters in there to
get different behaviors and stuff but
it's all based on now it's one
differential equation that you need to
solve so
one differential equation for every
synapse dendrite or neuron in your in
your system and for the neuroscientists
out there it's just a simple leaky
integrate and fire model
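A minimal sketch of the one-equation-per-element idea, assuming the textbook leaky integrator ds/dt = -s/tau + drive stepped with forward Euler, plus an optional threshold and reset to turn it into a leaky integrate and fire unit. This is a generic model, not the group's actual simulator; the function name, parameters, and values are illustrative.

```python
def simulate_leaky_integrator(drive, dt, tau, threshold=None):
    """Forward-Euler integration of ds/dt = -s / tau + drive.

    drive is a sequence of input values, one per time step of length dt.
    If threshold is given, the state resets to zero whenever it reaches
    the threshold, and that step index is recorded as a spike.
    Returns (trace, spikes).
    """
    s = 0.0
    trace, spikes = [], []
    for step, x in enumerate(drive):
        s += dt * (-s / tau + x)  # one differential equation per element
        if threshold is not None and s >= threshold:
            spikes.append(step)
            s = 0.0
        trace.append(s)
    return trace, spikes


# Without a threshold the element is a plain leaky integrator
# (the synapse or dendrite case): it relaxes toward tau * drive.
trace, _ = simulate_leaky_integrator([1.0] * 2000, dt=0.01, tau=1.0)
print("steady state:", round(trace[-1], 6))

# With a threshold the same equation behaves like a spiking cell body.
_, spikes = simulate_leaky_integrator([1.0] * 300, dt=0.01, tau=1.0, threshold=0.5)
print("spike steps:", spikes)
```

Because every element is the same equation with different parameters, a network simulation reduces to stepping many copies of this update on a coarse time grid.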
leaky integrator basically the synapse
is a leaky integrator a dendrite is a
leaky integrator so i'm really
fascinated by how
this one simple component
can be used to achieve lots of different
types of dynamical activity
and
to me that's where scalability comes
from and also complexity as well
complexity is often characterized by
relatively simple building blocks
connected in
potentially simple or sometimes
complicated ways and then emergent new
behavior that was hard to predict from
those
simple simple elements and that's
exactly what we're working with here so
it's a very exciting platform
both from a modeling perspective and
from a hardware manifestation
perspective where we can
hopefully start to to have this uh test
bed where we can explore things not just
related to neuroscience but also related
to other things
that connected to other physics like
critical phenomena ising models things
like that so
you were asking how we simulate these
circuits
it's it's at different levels and we've
got the simple spice circuit
stuff that's no problem and now we're
building these network models based on
this more efficient leaky integrator so
we can actually
reduce every element to one differential
equation and then we can also step
through it on a much coarser time grid
so it ends up being something like a
factor of a thousand to ten thousand
speed improvement which allows us to
simulate
but hopefully up to up to millions of
neurons
um whereas before we would have been
limited to
tens 100 something like that and just
like uh simulating quantum mechanical
systems with a quantum computer so the
goal here is to understand such systems
for me
the goal is to study this as a
scientific physical system it i'm not
drawn towards turning this into an
enterprise at this point i feel
short-term applications that obviously
make a lot of money is not necessarily a
curiosity driver for you at the moment
absolutely not if you're interested in
short-term making money go with deep
learning use silicon microelectronics if
you want to understand
things like the the physics of a
fascinating system or if you want to
understand
something more along the lines of the
physical limits of what can be achieved
then i think
single photon communication
superconducting electronics
is extremely exciting what if i want to
use superconducting hardware at 4 kelvin
to mine bitcoin that's my main interest
that's that's the reason i wanted to
talk to you today i want us not i don't
know what's what's bitcoin
[Laughter]
it's a
look it up on the internet somebody
somebody told me about it i'm not sure
exactly what it is
uh so but let me ask nevertheless about
applications to machine learning okay so
what like if you look at the scale of 5
10 20 years is the is it possible
to uh before we understand the nature of
human intelligence and general
intelligence do you think we'll start
falling out of this exploration
of neuromorphic systems
ability to solve some of the problems
that the machine learning systems of
today can't solve
i'm i'm really
hesitant to over promise so
i i really don't know i also i i don't
really understand machine learning in a
lot of senses i mean
machine learning
from my perspective appears to require
that you
know
precisely what your input is
and also what your goal is you usually
have some objective function or
something like that and
that's just that's very limiting i mean
of course a lot of times that that's the
case you know there's a there's a
picture and there's a horse in it so
you're done but that's not a very
interesting problem
i think
when i when i think about intelligence
it's
it's almost defined by the ability to
handle problems where you don't know
what your inputs are going to be and you
don't even necessarily know what you're
trying to accomplish i mean
i'm not sure what i'm trying to
accomplish in this world but at all
scales yeah at all scales right i mean
so and sometimes
so i i'm i'm more drawn to
the underlying phenomena the the
the critical dynamics of this of this
system trying to understand
how
elements that you build into your
hardware
result in
emergent fascinating activity
that was very difficult to predict
things like that so but but but i got to
be really careful because i think a lot
of other people who if they found
themselves working on this project in my
shoes they would say all right what are
what are all the different ways we can
use this for machine learning actually
let me let me just definitely mention
colleague at nist mike schneider he's also
very much interested
particularly in the superconducting
side of things
using the incredible speed power
efficiency also ken segall at colgate
other people working on specifically the
superconducting side of this
for machine learning and
deep feed-forward neural networks there
the advantages are obvious it's
extremely fast yeah so that's less on
the nature of intelligences and more on
various characteristics of this hardware
right yes you can use for the basic
computation as we know it today yeah and
communication one of the things that
mike schneider's working on right now is
an image classifier at a relatively
small scale i think he's targeting that
nine pixel problem where you can have
three different characters and um
you just you put in a nine pixel image
and you classify it as one of these
three three uh categories and that's
going to be really interesting to see
what happens there because
if you can show that even at that scale
you just put these images in and you get
it out and you can he thinks he can do
it i forgot if it's a nanosecond or some
extra extremely fast classification time
it's probably less it's probably 100
picoseconds or something
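To make the nine pixel problem concrete, here is a conventional software sketch: a 3 by 3 binary image is assigned to whichever of three stored templates it differs from in the fewest pixels. The conversation does not name the three characters, so the templates and labels below are hypothetical, and this says nothing about how the superconducting circuit itself does the classification.

```python
# Hypothetical 3x3 templates, flattened row by row (1 = pixel on).
TEMPLATES = {
    "z": (1, 1, 0,
          0, 1, 0,
          0, 1, 1),
    "v": (1, 0, 1,
          1, 0, 1,
          0, 1, 0),
    "n": (0, 1, 0,
          1, 0, 1,
          1, 0, 1),
}


def hamming_distance(a, b):
    """Number of pixel positions where two images differ."""
    return sum(x != y for x, y in zip(a, b))


def classify(image):
    """Return the label of the template closest to a nine pixel image."""
    if len(image) != 9:
        raise ValueError("expected exactly nine pixels")
    return min(TEMPLATES, key=lambda label: hamming_distance(image, TEMPLATES[label]))


# A noisy input: the "v" template with its first pixel flipped.
noisy_v = (0,) + TEMPLATES["v"][1:]
print(classify(TEMPLATES["z"]), classify(noisy_v))
```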
there you have challenges though because
the josephson junctions themselves the
electronic circuit is extremely power
efficient some orders of magnitude or
something more than a transistor doing
the same thing but when you have to cool
it down to 4 kelvin you pay a huge
overhead just for keeping it cold even
if it's not doing anything so
it's it you ha it has to work at big at
large scale in order to overcome that
power penalty
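A rough sketch of why the cooling overhead only pays off at large scale. Every number here is a placeholder, not a measured value: the model assumes a fixed idle cooling cost plus a cooling cost proportional to the heat dissipated at 4 kelvin, and compares the total with a silicon system doing the same work.

```python
def cryo_wall_power(silicon_w, efficiency_gain, cooling_w_per_w, idle_cooling_w):
    """Total wall power (watts) of a cryogenic system doing the same work.

    silicon_w       : power a silicon system would draw for the workload
    efficiency_gain : how many times less power the cold circuits dissipate
    cooling_w_per_w : cooling watts spent per watt dissipated at 4 kelvin
    idle_cooling_w  : cooling power paid even when nothing is computing
    All four are hypothetical parameters for illustration.
    """
    chip_w = silicon_w / efficiency_gain
    return idle_cooling_w + chip_w * (1.0 + cooling_w_per_w)


# Placeholder values, chosen only to show the crossover.
GAIN, COOLING, IDLE = 1e4, 1000.0, 1000.0

for silicon_w in (100.0, 1e4, 1e6):
    cryo_w = cryo_wall_power(silicon_w, GAIN, COOLING, IDLE)
    verdict = "cryogenic wins" if cryo_w < silicon_w else "silicon wins"
    print(f"silicon {silicon_w:>9.0f} W  cryogenic {cryo_w:>9.1f} W  {verdict}")
```

With these placeholders the small workload is dominated by the idle cooling cost, while the large one comes out ahead despite the overhead.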
but that's possible it's just it's gonna
have to get that performance and this is
sort of what you were asking about
before is like how much better than
silicon would it need to be
and the answer is i don't know i think
if it's if it's just overall better than
silicon at a problem that a lot of
people care about maybe it's
image classification maybe it's face
recognition maybe it's
monitoring credit transactions i don't
know then i think it will have a place
it's not going to be in your cell phone
but it could be in your data center
uh so what about
in terms of the data center i don't know
if you're paying attention to the
various systems like um tesla recently
now announced uh dojo
which is a large scale machine learning
training system
that again the bottleneck there is
probably going to be communication
between those systems
um is there something from your work on
um
the everything we've been talking about
in terms of superconductive hardware
that could be useful there oh i mean i
okay tomorrow no in the long term it
could be the whole thing it could be
nothing i i don't know but definitely
definitely
um
when you look at the so i don't know
that much about dojo my understanding is
that that's new right that's just just
coming online well
i don't i don't even know where
uh it's it hasn't come online and and
when you announce big sexy so let me
explain to you the way things work in
the in the world out there in the word
of business and marketing it's not
always clear where you are on the coming
online part of that so i don't know
where they are exactly but the vision is
from ground up to build up you know a
very very large scale
modular machine learning asic basically
hardware that's optimized for training
neural networks and of course there's a
lot of companies that are small and big
working on this kind of problem the
question is how to do it
in a modular way that uh this has very
fast communication the interesting
aspect of tesla is you have a company
that at least at this time
is so
singularly focused on solving a
particular machine learning problem
and is making obviously a lot of money
doing so because the machine learning
problem happens to be involved with
autonomous driving so you have a system
that's um
driven by an application
and that's really interesting because
you know
uh
you have maybe google working on tpus
and so on
you have all these other
companies with asics
they're usually more kind of always
thinking general
so i like it when it's driven by a
particular application because then you
can really get to the
it's like
it's somehow if you just talk broadly
about intelligence you may not always
get
to the right solutions it's nice to
couple that sometimes with a specific
clear illustration of something that
requires general intelligence which for
me driving is one such case i think
you're exactly right sometimes just
having that focus on that application
brings a lot of people focuses their
energy and attention i think that so one
of the things that's appealing about
what you're saying is not just that the
application is specific but also that
the scale is big big and that the
benefit is
is also huge so financial and to
humanity right right right yeah so i
guess let me just try to understand is
the point of this dojo system to
figure out the the parameters that then
plug into neural networks and then
you don't need to retrain you you just
make copies of a certain chip that has
all the all the parameters established
or no it's straight up retraining a
large neural network over and over and
over so you have to do it once for every
new car
no no you have to uh so they do this
interesting process which i think is a
process for machine learning supervised
machine learning systems
you're going to have to do which is
you uh have a system you train your
network once it takes a long time i
don't know how long but maybe a week
okay to train
and then you deploy it on let's say
about a million cars i don't know what
the number is that part you just write
software that updates some weights in a
table and yeah okay but there's a loop
back yeah yeah okay each of those cars
run into trouble
rarely but like they they uh
they catch the edge cases of the
performance of that particular system
and then send that data back
and either automatically or by humans
that
weird edge case data is annotated and
then the network has to become smart
enough to now be able to perform in
those edge cases so has to get retrained
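The deploy, collect, annotate, retrain loop described here can be sketched with a toy stand-in. A lookup table plays the role of the neural network and a dictionary plays the role of the human annotators; all names and scenarios are hypothetical, and real retraining is of course far more involved.

```python
def run_fleet(model, scenarios):
    """Return the scenarios the deployed model cannot handle (the edge cases)."""
    return [s for s in scenarios if s not in model]


def annotate(edge_cases, oracle):
    """Label the collected edge cases; the oracle stands in for annotators."""
    return {s: oracle[s] for s in edge_cases}


def retrain(model, labeled_cases):
    """Produce a new model that also covers the newly labeled cases."""
    updated = dict(model)
    updated.update(labeled_cases)
    return updated


def improve(model, fleet_rounds, oracle):
    """Repeat deploy -> collect failures -> annotate -> retrain for each round."""
    failures_per_round = []
    for scenarios in fleet_rounds:
        edge_cases = run_fleet(model, scenarios)
        failures_per_round.append(len(edge_cases))
        model = retrain(model, annotate(edge_cases, oracle))
    return model, failures_per_round


oracle = {"clear road": "go", "red light": "stop", "cone": "slow", "deer": "stop"}
model = {"clear road": "go", "red light": "stop"}
rounds = [["clear road", "cone"], ["cone", "deer", "red light"], ["deer", "cone"]]

model, failures = improve(model, rounds, oracle)
print(failures)
```

The failure count per round falls as the edge cases are folded back in, which is the sense in which the system becomes a little less dumb each iteration.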
there's clever ways of retraining
different parts of that network but for
the most part i think they prefer to
retrain the entire thing so you have
this giant monster that kind of has to
be trained regularly i think the vision
with uh dojo is to have a very large
machine learning focused
driving focused super computer that then
is sufficiently modular they could be
scaled to other machine learning
applications right but like so they're
not limiting themselves completely to
this particular application but is this
application is the way they kind of test
this iterative process of machine
learning is
you make a system that's very dumb
deploy it
get the edge cases where it fails make
it a little smarter it becomes a little
less dumb and that where that iterative
process achieves something that you can
call intelligent or as
smart enough to be able to solve this
particular application so it has to do
with
uh
training neural networks
fast
and training neural networks that are
large but also based on an extraordinary
amount of diverse input data yeah and
that's one of the things so this does
seem like one of those spaces where
the the scale of superconducting
optoelectronics the way that um so when
you talk about the weaknesses like i
said okay well you have to cool it down
at this scale that's fine that's because
that's not that's not too much of an
added cost most of your power is being
dissipated by the circuits themselves
not the cooling and also
you have one
centralized kind of cognitive hub if you
will
and so
when if we're talking about putting a
superconducting system in a car that's a
that's questionable do you really want
a cryostat in the trunk of everyone's
car it'll fit it's not that big of
a deal but
hopefully there's a better way right but
since this is sort of a central supreme
intelligence or something like that
and it's it needs to really have this
massive data acquisition massive data
integration
i would think that that's where
large-scale spiking neural networks with
vast communication and all these things
would would have something pretty
tremendous to offer it's not going to
happen tomorrow there's a lot of
development that needs to be done
but you know
we have to be patient with self-driving
cars for a lot of reasons we were all
optimistic that they would be here by
now and okay they are to some extent but
if we're thinking five or ten years down
the line it it's it's not
unreasonable one other thing i'll just
let me just mention
that getting into self-driving cars and
technologies that are using ai out in
the world this is something nist cares a
lot about elham tabassi is leading up a
much larger effort in ai at nist than
than my my little project
and
really
central to that mission is this concept
of trustworthiness
so
when you're going to deploy this neural
network
in every single automobile with so much
on the line you have to be able to trust
that so now how do we know how do we
know that we can trust that how do we
know that we can trust the self-driving
car or the the supercomputer that that
trained it
there's a lot of work there and there's
a lot of that going on at nist and we're
it's still early days i mean
you're familiar with the yeah yeah and
all that but there's a fascinating dance
in engineering with like safety critical
systems
there's a desire in computer science i
just recently talked to don knuth
to to uh you know for algorithms and for
systems for them to be provably correct
or provably safe and
you know this is one other difference
between humans and biological systems is
we're not provably anything right right
and so there's uh some aspect of uh
imperfection yes that we need to have
built in like robustness to imperfection
um
be part of our systems which is a
difficult thing for engineers to contend
with they're very uncomfortable with the
idea that
you have to be okay with failure and
almost engineer failure into the system
mathematicians hate it too but i think
it was i think it was turing who said
something along the lines of
i can give you an intelligent system or
i can give you a flawless system but i
can't give you both and it's in sort of
creativity and abstract thinking seem to
rely somewhat on
stochasticity and um
not
having components that perform exactly
the same way every time this is where
like the disagreement i have with not
disagreement but a different view on the
world i'm with turing
when i talk to
robotic
uh robot colleagues that sounds like i'm
talking to robots colleagues that are
roboticists
the goal is perfection
and to me is like
no i think the goal
should be um
imperfection that's communicated
and through the interaction between
humans and robots that imperfection
becomes um a feature not a bug right
like together as a seen as a system the
human and the robot together are better
than any either of them individually but
the robot itself is not perfect any in
in any way of course i've had a bunch
of disagreements including with mr elon
about
uh
to me autonomous driving is
fundamentally a human robot interaction
problem not a robotics problem right
to elon it's a robotics problem that's
actually an open
and fascinating question whether humans
can be removed from the loop completely
we've talked about a lot of fascinating
chemistry and
physics and engineering and we're always
running up against this issue that
nature seems to dictate what's easy and
what's hard
so you have this uh cool little paper
that i'd love to just ask you about
it's titled does cosmological evolution
select for technology
so
in physics
there's uh
parameters that seem to define the way
our universe works
that physics works that if it worked any
differently we would get a very
different world so it seems like the
parameters are very fine-tuned to the
kind of physics that we see
all the beautiful e equals mc squared
that would get these nice beautiful laws
it seems like very fine-tuned for that
so what you argue in this article
is uh it may be that the universe has
also fine-tuned its parameters
that enable the kind of technological
innovation that we see
that technology that we see can you can
you explain this idea yeah i think
you've introduced it nicely let me let
me just
try to say a few things in in
my language
to lay out what is what is this fine-tuning
problem so
physicists have spent centuries
trying to understand
the
the system of equations that govern the
way nature behaves the way particles
move and
interact with each other
and as that understanding has become
more clear over time
it became sort of evident that
it's all well adjusted to allow
a universe like we see very complex
this large long-lived universe
and so one one answer to that is well of
course it is because we wouldn't be here
otherwise but um i don't know that's not
very satisfying that's sort of that's
what's known as the weak anthropic
principle it's a statement of selection
bias we can only observe a universe that
is fit for us
to to live in so what does it mean for a
universe to be fit for us to live in
well
the pursuit of physics it is based
partially on coming up with equations
that describe how things behave and
interact with each other
but in all those equations you have so
there's the form of the equation sort of
how how different
fields or particles um move in space and
time
but then there are also the parameters
that
just tell you sort of the strength of
different
couplings how strongly does a charged
particle couple to the electromagnetic
field or masses how how strongly does a
a particle couple to the higgs field or
something like that
and
those parameters that define
that not not the general structure of
the equations but the
relative importance of different terms
they seem to be every bit as important
as the the structure of the equations
themselves and so i forget who it was
somebody when they were working through
this and trying to see okay if i adjust
the parameter this parameter over here
call it the say the fine structure
constant which tells us the strength of
the electromagnetic interaction oh boy i
can't change it very much otherwise
nothing works the universe sort of
doesn't it just pops into existence and
goes away in a nanosecond or something
like that and and somebody had the
phrase this looks like a put up job
meaning
every one of these parameters was dialed
in
it's arguable how
precisely they have to be dialed in but
dialed in to some extent
not just in order to enable our
existence that's a very anthropocentric
view but to enable a universe like this
one so okay maybe
i think the majority position of working
physicists in the field is
it has to be that way in order for us to
exist we're here we shouldn't be
surprised that that's the way
the universe is and i don't know for a
while that never sat well with me but i
just kind of
moved on because there are things to do
and a lot of exciting work doesn't
depend on resolving this
puzzle but
as i started working more
with technology getting into the more
recent years of my career particularly
when i started
after having worked with silicon for a
long time which was
kind of eerie on its own but then when i
switched over to superconductors i was
just like
this is crazy
it's just absolutely um astonishing that
our universe
gives us superconductivity it's one of
the most beautiful physical phenomena
and it's also
extraordinarily useful for technology so
you can argue that the universe has to
have the parameters it does for us to
exist because we couldn't be here
otherwise but why does it give us
technology why does it give us
silicon that has this ideal oxide that
allows us to make a transistor without
trying that hard
that can't be explained by the same
anthropic reasoning
yeah so it's asking the why question i
mean the
a slight natural extension of that
question is i wonder if
the parameters were different
if we would simply have just
another set of paint brushes to create
totally other things that
wouldn't look anything
like the technology of today but would
nevertheless have
incredible complexity which is if you
sort of zoom out and start defining
things not by
uh like how many batteries it needs and
whether it can make toast
but more like how much complexity is
within the system or something like that
right well yeah you can you can start to
quantify things like you're exactly
right so
nowhere am i arguing that in all of the
vast parameter space of everything that
could conceivably exist in the
multiverse of nature
there is this one point in parameter
space where complexity arises i i doubt
it it that would be
a shameful waste of resources it seems
yeah but it might be that we reside at
one place in parameter space that has
been
um adapted through an evolutionary
process to allow us to make certain
technologies that that allow our our
particular kind of universe to arise and
sort of achieve the things it does see i
wonder if nature in this kind of
discussion if nature is a catalyst for
innovation or if it's a ceiling for
innovation so like
is it going to always limit us like what
you're talking about silicon
is it just make it super easy to do
awesome stuff in a certain dimension but
we could still do awesome stuff in other
ways it'll just be harder or does it
really set like the maximum we can do
that's a
good subject to discuss i guess i feel
like we need to lay a little bit more
groundwork so i want to make sure that
um
i introduce this in the context of lee
smolin's previous idea so who's lee
smolin and what kind of ideas does he
have okay lee smolin is a theoretical
physicist who
back in the late 1980s or early 1990s
published a paper introducing this idea
of cosmological natural selection which
argues that
the universe did evolve so his paper was
called did the universe evolve and
i gave myself the liberty of titling my
paper does cosmological selection or
just cosmological evolution select for
technology in reference to that so he
introduced that idea decades ago now he
primarily works on
um quantum gravity loop quantum gravity
other approaches to
um
unifying quantum mechanics with general
relativity as you can read about in his
most recent book i believe and he's been
on your show as well
so but i i want to introduce this idea
of cosmological natural selection
because i think that is
one of the core ideas that could change
our understanding of
how the universe got here our role in it
what technology's doing here
but there's a couple more pieces that
need to be set up first so the beginning
of our universe is
largely accepted to be the big bang and
what that means is if you look back in
time by looking
far away in space you see that um
everything used to be at at one point
and it expanded away from there there
was a uh
era in the evolutionary process of our
universe that was called inflation and
this idea was developed primarily by
alan guth and others andrei linde and
others in the 80s
and this idea of inflation is is
basically that when a
a singularity
uh
begins this process
of of growth
there can be a temporary stage where it
just accelerates incredibly rapidly and
based on
quantum field theory this tells us that
this should produce matter in precisely
the proportions that we find of hydrogen
and helium in the big bang lithium too
lithium also um
and other things too so so the
predictions that come out of big bang
inflationary cosmology have stood up
extremely well to empirical verification
the cosmic microwave background
things like this so
most scientists working in the field
think that
the origin of our universe is the big
bang
and i i
base all my thinking on that as well
i'm just laying this out there so that
people understand
that where i'm coming from is an
extension not a replacement of
of existing well-founded ideas
in a paper i believe it was 1986 with uh
alan guth and um another author farhi
they
they wrote that a a big bang i don't
remember the exact quote a big bang is
inextricably linked with a
black hole this singularity that we call
our origin
is mathematically indistinguishable from
a black hole they're they're the same
thing
and lee smolin based his thinking uh on
that idea i i believe i don't mean to
speak for him but this is my reading of
it so
what lee smolin will say is that
a black hole in one universe is a big
bang in another universe
and this allows us to have
progeny offspring so uh
a universe can be said to have come
before another universe
and very crucially smolin
argues i think this is potentially one
of the great ideas of all time that's my
opinion that
when a black hole forms it's not a
classical entity it's a quantum
gravitational entity so there it is
subject to the fluctuations that are
inherent in quantum mechanics
the
the properties that what we're calling
the parameters that describe the physics
of that system
are subject to slight mutations so that
the offspring universe does not have the
exact same parameters defining its
physics as its parent universe
they're close but they're a little bit
different and so now you have a
mechanism for
evolution
for natural selection
so this mutation so there's
and then if you think about the
the dna of the universe are the basic
parameters that govern its laws exactly
so so
so what smolin said is our
universe results from an evolutionary
process that can be traced back some he
estimated 200 million generations
initially there was something like a
vacuum fluctuation that produced
through through random chance uh
a universe that was able to reproduce
just once so now it had one offspring
and then over time it was able to make
more and more until it evolved into a
highly structured
universe with a very long lifetime with
a great deal of complexity and
importantly especially importantly for
lee smolin
stars
stars make black holes therefore we
should expect our universe to be
optimized have its physical parameters
optimized to make very large numbers of
stars because that's how you make black
holes and black holes make offspring so
we expect our the physics of our
universe to have evolved to maximize
fecundity the number of offspring and
the way lee smolin argues you do that is
through stars that the biggest ones die
in these core collapse supernovae that
make a black hole and a child
okay first of all i agree with you that
this is back to our fractal view of
everything from intelligence to our
universe
that is very compelling and a very
powerful idea
that um
unites
the origin of life and perhaps the
origin of ideas and intelligence
so from a dawkins perspective here on
earth the evolution of those
and then the evolution of the laws of
physics that led to us
i mean it's beautiful and then you
stacking on top of that that maybe we
are one of the offspring
right okay so
before getting into
where i'd like to take that idea let me
just lay a little bit more groundwork there
is this concept of the multiverse and it
it can be confusing different people use
the word multiverse in different ways
in
in the
multiverse
that i think is relevant to picture when
trying to grasp lee smolin's idea
essentially every
every vacuum fluctuation can be referred
to as a universe it it occurs it borrows
energy from the vacuum for some finite
amount of time and it evanesces back
into
the quantum vacuum and
ideas of uh guth before that and
andrei linde with uh
eternal inflation aren't that different
that you would expect nature
due to the the quantum properties of the
vacuum which we we know exists they're
they're measurable through things like
the casimir effect and others
you know that there are these
fluctuations that are occurring what
smolin is arguing is that there is
this extensive multiverse that this
universe what we can
measure and interact with is not unique
in nature
it's just our residence it's it's where
we reside and there are countless
potentially infinitely many
other universes other entire
evolutionary trajectories that have
evolved into things like what you were
mentioning a second ago with different
parameters and different
ways of achieving complexity and
reproduction and all that stuff so it's
not that the evolutionary process
is a funnel towards this end point not
at all just like the biological
evolutionary process that has occurred
within our universe is not
a unique route toward achieving one
specific chosen kind of species no we we
have
extraordinary diversity around us that's
what evolution does and for any one
species like us it might feel like we're
at the center of the process we're the
destination of this process but we're
just one of the many
uh nearly infinite branches of this
process and i suspect it is exactly
infinite i mean i just can't understand
how
with this idea you can never draw a
boundary around and say no the
universe i mean the multiverse has ten
to the
one quadrillion components but not
infinity i don't know that that's well
yeah i am uh cognitively incapable
as i think all of us are of
truly understanding the concept of
infinity and the concept of nothing as
well and nothing but also the concept of
a lot is pretty difficult
i could just i can count i run out of
fingers yeah at a certain point and then
you're screwed and when you're wearing
shoes and you can't even get down to
your toes it's like
it's like a thousand fine a million is
that what and then it gets crazier and
crazier right right
so this particular so when we say
technology by the way
i mean
there's some
not to over romanticize the thing but
there is some aspect about this branch
of ours that allows us to um for the
universe to know itself yes yes so to be
to to have like little
conscious cognitive fingers
that are able to feel
like to scratch the head right right
right uh to to be able to construct e
equals something squared and to
introspect to have to start to gain some
understanding of the laws that govern it
isn't that um
isn't that kind of uh amazing you know
okay i'm just human but
it feels like that if i were to build a
system that does this kind of thing that
evolves laws of physics that evolves
life that evolves intelligence that my
goal would be
to come up with things that are able to
think about themselves
right aren't we kind of close to the
the the design specs the destination
we're pretty close i don't know i mean
i'm spending my career designing things
that i hope will think about themselves
exactly you and i aren't too far apart
on that one but then maybe that problem
is a lot harder than we imagined maybe
we need to
let's not get let's not get too far
because i want to emphasize something
that
what you're saying is isn't it
fascinating that the universe evolved
something that can be conscious reflect
on itself but
lee smolin's idea didn't take us there
remember it took us to stars
lee smolin has argued
i think right on almost every single way
that
cosmological natural selection could
lead to a universe with rich structure
and he argued that the structure the
physics of our universe is designed to
make a lot of stars so that they can
make black holes but that doesn't
explain what we're doing here in order
to in order for that to be an
explanation of us what you have to
assume is
that once you made that universe that
was capable of producing stars
life planets all these other things were
along for the ride they got lucky we're
we're kind of arising growing up in the
cracks but the universe isn't here for
us we're still kind of a fluke in that
picture and i can't i don't i don't
necessarily have like a philosophical
opposition to that stance it's just not
um
okay so
i don't think it's complete so it seems
like whatever we got going on here to
you it seems like whatever we have here
on earth
seems like a thing you might want to
select for in this whole big process
exactly so if what you are truly if your
entire evolutionary process
only cares about fecundity it only cares
about making offspring universes because
then there's going to be the most of
them in that
local region of of hyperspace which is
the set of all possible
universes let's let's say um you don't
care how those universes are made you
know they have to be made by black holes
this is what this is what inflationary
theory tells us the big bang tells us
that black holes make universes
but what if there was a technological
means to make universes stars require a
ton of matter
because they're they're not thinking
very carefully about how you make a
black hole they're just using gravity
you know
um but
if we devise technologies that can
efficiently compress matter into a
singularity
it turns out that if you can compress
about 10 kilograms into a very small
volume
that will make a black hole that is
likely highly probable to inflate into
its own offspring universe this is
according to calculations done by other
people who are professional quantum
theorists quantum field theorists and i
hope i am
grasping what they're telling me
correctly i'm somewhat of a of a
translator here but so so that's
that's the position that is particularly
intriguing to me which is that
what might have happened is that okay
this particular
branch on the vast tree of evolution
cosmological evolution now we're talking
about not biological evolution within
our universe but cosmological evolution
went through exactly the process that
lee smolin described got to the stage
where
stars were making lots of black holes
but then continued to evolve and somehow
bridged that gap and made intelligence
and intelligence capable of devising
technologies because technologies
in intelligent species working in
conjunction with technologies could then
produce
even more more efficiently more
like faster and better and more
different then you start to have
different kind of mechanisms of mutation
perhaps all that kind of stuff and so if
you do a simple calculation that says
all right if i want to
we know roughly how many
um
core collapse supernovae have
resulted in black holes in our galaxy
since the beginning of the universe and
it's something like a billion so then
you would have to estimate that it would
be possible for a technological
civilization to produce more than a
billion black holes
with the energy and matter at their
disposal and so
one of the calculations in that paper
back of the envelope but i think
revealing nonetheless is that if you
take a
relatively um
common asteroid something that's about a
kilometer in diameter what i'm thinking
of is just
scrap material laying around in our
solar system
and break it up into 10 kilogram chunks
and turn each of those into a universe
then you would have made at least a
trillion
black holes
outpacing the star production rate by
some three orders of magnitude that's
one asteroid so now if you envision an
intelligent species that would
potentially have been devised initially
by humans but then based on
superconducting optoelectronic networks
no doubt and they go out and populate
they don't they don't have to fill the
galaxy they just have to get out to the
asteroid belt
they could potentially dramatically
outpace the rate at which stars are
producing offspring universes and then
wouldn't you expect that
that's where we came from instead of a
star yeah so you have to somehow become
masters of gravity so like or not
necessarily gravity so stars make black
holes with gravity but any force that
can
make the energy density
can compactify matter to produce a great
enough energy density can form a
singularity it doesn't it would not
likely be gravity it's the weakest force
you're more likely to use
something like the technologies that
we're developing for fusion for example
so i don't know um
the national ignition facility recently
blasted a pellet with
a 100 really bright lasers and
caused that to get dense enough to
engage in nuclear fusion so something
more like that or a tokamak with a
really hot plasma i'm not sure something
i don't know exactly how it would be
done i do like the idea that um
especially just been reading a lot about
gravitational waves and you know
the fact that us humans with our
technological capabilities one of the
most impressive
uh technological accomplishments of
human history is ligo being able to
precisely detect gravitational waves i
particularly
find appealing the idea that other alien
civilizations from
very far distances communicate
with gravity with gravitational waves
because as you become greater and
greater master of gravity which seems
way out of reach for us right now
maybe that seems like an effective way of
sending signals
especially if your job is to
manufacture black holes right
so that that so let me ask there
whatever i mean broadly thinking because
we tend to think other alien
civilizations would be very human-like
but
if we think of alien civilizations out
there as basically generators of black
holes
however they do it because they get
stars
do you think there's a lot of them in
our particular universe
out there
in our universe
well okay let me ask okay this is great
let me ask
a very generic question
and then
let's see how you answer it which is uh
how many alien civilizations are out
there
if the hypothesis that i just described
is
on the right track yes it would mean
that the parameters of our universe have
been selected
so that
intelligent civilizations will occur in
sufficient numbers
so that if they reach
something like supreme technological
maturity let's define that as the
ability to produce black holes
then that's not a highly improbable
event it
it doesn't need to happen often because
as i just described if you get one of
them in a galaxy you're going to make
more black holes than the stars in that
galaxy
but
there's also not a super strong
motivation well
it's not obvious that you need them to
be ubiquitous throughout the galaxy
right so one of the things
that i try to emphasize in that paper is
that
given this idea of of how
our parameters might have been selected
it's clear that
it's a it's a series of trade-offs right
if you make i mean in order for
intelligent life of our variety or
anything resembling us to occur you need
a bunch of stuff you need stars so
that's right back to smolin's roots of
this idea but you also need
water to have certain certain properties
you need
you need things like the
the
rocky planets like the earth to be
within the habitable zone all these
things that you start talking about in
the
um
the field of astrobiology trying to
understand life in the universe but you
can't over emphasize you can't
tune the parameters so precisely to
maximize the number of stars or to
to give water exactly the properties or
or to make rocky planets like earth the
most numerous you have to compromise on
all these things and so i think the way
to test this idea
is to look at what parameters are
necessary for for each of these
different subsystems and i've laid out a
few that i think are promising there
there could be countless others
and see how
changing the parameters
makes it more or less likely that stars
would form and have long lifetimes or
that or that rocky planets in the
habitable zone are likely to form all
these different things so we can test
how how much these things are in a tug
of war with each other
and the prediction would be that we kind
of sit at this central point where
if you if you move the parameters too
much stars aren't stable or
life doesn't form or technology's
infeasible because because life alone at
least the kind of life that we know of
cannot make black holes we don't have
this well i'm speaking for myself
you're a very fit strong person but
it might be possible for you but not for
me to compress matter so we need these
technologies but
we don't know
we have not been able to
quantify yet how um
finely adjusted the parameters would
need to be in order for silicon to have
the properties it does okay this is not
directly speaking to what you're saying
you're getting to the fermi paradox
which is where are they where are the
the life forms out there how numerous
are they that sort of thing what i'm
trying to argue is that
if this framework is is on the right
track a potentially correct explanation
for our existence
we don't it doesn't necessarily predict
that intelligent civilizations are just
everywhere because even if you just get
one of them in a galaxy which is quite
rare
it could be enough to dramatically uh
increase the fecundity of the universe
as a whole yeah and i wonder once you
start generating the offspring for
universes black holes
how that has effect on the
what kind of effect does it have on the
other
uh candidates
civilizations within that universe maybe
it has a destructive aspect or there
could be some arguments about once you
have a lot of offspring that that just
quickly accelerates to where the other
ones can't even catch up it could
but i guess
if you want me to put my chips on the
table or whatever i think i come down
more on the side
that
intelligent life civilizations
are rare
and um i guess i follow max tegmark here
and also there's there's a lot of papers
coming out recently in the field of
astrobiology that are seeming to say all
right you just worked through the
numbers on
on some modified drake equation or
something like that and it looks like
it's not improbable you wouldn't you
shouldn't be surprised that an
intelligent species has arisen in our
galaxy but if you think there's one the
next solar system over it's it's highly
improbable so i can see that the number
the probability of finding a
a civilization in a galaxy maybe it's
most likely that you're gonna find
one to a hundred or something but okay
now it's it's really important to put a
time window on that i think because
does that mean in the entire lifetime of
the galaxy before it it um
so for in our case before we run into
andromeda
i think it's highly probable i shouldn't
say i think
it's tempting to believe that it's
highly probable that in that entire
lifetime of your galaxy you're going to
get at least one intelligent species
maybe thousands or something like that
but it's also
i think um
a little bit naive to think that they're
going to coincide in time
and we'll be able to observe them and
also if you look at the
span of
life on earth the earth earth
history
it was surprising to me to kind of look
at
the amount of time well first of all the
the short amount of time there's no life
is surprising life sprang up pretty
quickly it's cellular single cell so but
that was that's the point i'm trying to
make is like so much
of life on earth was just like single
cell organisms like most of it most of
it was like boring bacteria type of stuff
well bacteria are fascinating but i take
your point no i get it i mean no offense
to them
this kind of speaking from the
perspective of your paper of something
that's able to generate technology as we
kind of understand it that's a very
short moment in time relative to that
that full history of life on earth and
maybe our universe is just
saturated with
bacteria like
humans right
but
not the special extra
agi
super humans
that those are very rare and once those
spring up
everything just goes to like it
accelerates very quickly
yeah it's it's we just don't have enough
data to really say but i find this whole
subject extremely engaging i mean
there's this concept i think it's called
the rare earth hypothesis which is that
basically stating that
okay microbes were here right away after
the hadean era where we were being
bombarded well after yeah bombarded by
comets asteroids things like that and
also after the moon formed so once
things settled down a little bit
in a few hundred million years
you have microbes everywhere and it
could have been we don't know exactly
when it could have been remarkably brief
that that took so it does indicate that
okay life forms relatively easily i
think
that alone is sort of a checker on the
scale for the
argument that
the parameters that allow even microbial
life to form are not just a fluke but
anyway that aside yes then there was
this long dormant period not dormant
things were happening but
um important things were happening for
some two and a half billion years or
something after um the metabolic process
that releases oxygen was developed then
basically the planet is just sitting
there getting more and more oxygenated
more and more oxygenated until it's
enough that you can build these large
complex organisms and so the rare earth
hypothesis would argue that
the microbes are common everywhere
any planet that's like roughly in the
habitable zone and has some water on
it is probably gonna have those but then
getting to this cambrian explosion that
happened some between five and six
hundred million years ago
that's that's rare you know
and i i buy that i think that is rare so
if you say how much life is in our
galaxy i think that's probably
the right answer is that microbes are
everywhere cambrian explosion is
extremely rare
and then
but the cambrian explosion kind of went
like that where
um within a couple tens or 100 million
years
all of these body plans came into
existence and and basically all of the
body plans that are now in existence on
the on the planet
were formed in that brief window
and we've just been
shuffling around since then so then what
what
caused humans to pop out of that i mean
that could be another
extremely rare
threshold that a planet roughly in the
habitable zone with water is not
guaranteed to cross you know to me it's
fascinating for being humble like the
humans cannot possibly be the most
amazing thing that such if you look at
the entirety of the system that lee
smolin and you paint that cannot possibly
be the most amazing thing that process
generates so like if you look at the
evolution what's the equivalent in the
cosmological evolution and its selection
for technology the equivalent of the
human eye or the human brain universes
that are able to do some like
they don't need the damn stars
they they're able to just do some
incredible generation of complexity fast
on like
much more than if you think about it's
like most of our universes are pretty
freaking boring
there's not much going on there's a few
rocks flying around and there's some
like apes that are just like
um
doing podcasts
on some weird planet it just seems very
inefficient if you think about like the
the amazing thing the human eye the
visual cortex can do the the brain the
nervous everything that makes us more
powerful than
single cell organisms like if there's an
equivalent of that for universes
they're like the richness of physics
that could be uh
they could be expressed through a
particular set of parameters
like
i mean
that
like for me i'm uh so from a computer
science perspective a huge fan of
cellular automata which is a nice
sort of pretty visual way to illustrate
how different laws can result in uh
drastically different levels of
complexity so like it's like yeah okay
so we're all like celebrating look our
little cellular automata is able to
generate pretty triangles and squares
and therefore we achieve general
intelligence and then there'll be like
some badass chuck norris type like
uh
universal turing machine type of
cellular automata that are able to
generate other cellular automata
that does any arbitrary level of
computation off the bat
it like those have to then exist
and then we're just like this we're just
we'll be forgotten is this the story
this is uh this podcast just entertains
a few other apes for
for a few months well i i'm kind of
surprised to hear your cynicism be no
i'm very up i i usually think of you as
like a
one who celebrates humanity in all its
forms and things like that and i i guess
i just don't see it the way you just
described i mean okay we've been
here for 13.7 billion years and you're
saying
gosh that's a long time let's get on
with the show already some other
universe could have kicked our butt by
now but
that's putting a characteristic time i
mean why is 13.7 billion a long time i
mean compared to compared to what i
guess so when i look at our universe i
see this extraordinary hierarchy that
has developed over that time so at the
beginning it was a chaotic mess of
you know some
plasma and
nothing interesting going on there and
even for the first stars to form a
lot of
really interesting uh evolutionary
processes had to occur by evolutionary
in that sense i just mean
um taking place over extended periods of
time
and structures are forming then and then
it took that first generation of stars
in order to
produce the metals
that then can more efficiently produce
another generation of stars we're only
the third generation of stars so we
might still be pretty quick to the to
the game here so
i but i don't think i don't
okay so then so then you have these
stars now you have solar systems on
those solar systems you have
um
rocky worlds you have gas giants like
all this complexity and then you start
getting life and the
the complexity that's evolved through
the evolutionary process in in life
forms is just
it's not a a letdown to me just no no
and there's some of it is like some some
of the planets is like
icy it's like different flavors of ice
cream they're icy but there might be
water under yeah
all kinds of life forms with some
volcanoes right all kinds of weird stuff
no no i i don't uh i think it's
beautiful i think our life is beautiful
and i think it was uh designed that by
by design the scarcity of the whole
thing i think mortality as terrifying as
it is is fundamental to the whole
reason we enjoy everything no i think
it's beautiful i just think that all of
us
um conscious beings in the grand scheme
of basically every at every scale will
be completely forgotten well that's true
i think everything is transient and that
would go back to maybe something more
like
lao tzu the dao de jing or something
where it's like
yes there is nothing but change there is
nothing but emergence and dissolve and
that that's it but i just
in this picture of this hierarchy that's
developed i don't mean to say that now
it gets to us and that's the pinnacle in
fact i think
the
at a high level the story i'm trying to
tease out in my research is about okay
well so then what's the next level of
hierarchy and if in if it's
okay we're we're kind of pretty smart i
mean talking about people like lee smolin
and alan guth max tegmark okay we're
really smart talking about me okay we're
kind of we're we can find our way to the
grocery store or whatever but sometimes
but what's next you know i mean what if
what if there's another level of
hierarchy that grows on top of us that
is even more profoundly capable and
i mean we've talked a lot about
superconducting sensors imagine these
uh cognitive systems far more capable
than us residing
somewhere else in the solar system off
of the surface of the earth where it's
much darker much colder much more
naturally suited to them and they have
these sensors that can detect single
photons of light uh from radio waves out
to all across the spectrum to gamma rays
and just see the whole universe and they
just live in space with these massive
um
collection optics so that they what what
did they do they just look out and and
experience that that vast array of
of what's
being developed
and if you're such a system
presumably you would do some things for
fun
and the kind of fun thing i would do
somebody who likes video games is i
would create and maintain and observe
something like earth
and
so in some sense we're like all what
players
on a stage for this uh superconducting
um
cold
computing system out there i mean all of
this is fascinating to think the the
fact that you're actually designing
systems here on earth that are trying to
push this technological at the very
cutting edge and also thinking about how
does the
like the evolution of physical laws
lead us to the way we are
it's fascinating that that coupling is
fascinating it's like the ultimate
rigorous application of philosophy
to the rigorous application of
engineering
so i um jeff you're one of the most
fascinating i'm i'm so glad i did not
know much about you except through your
work and i'm so glad we got this um
chance to talk you're
you're
one of the best explainers of
exceptionally difficult concepts um
and you're also the the speaking of like
fractal you're able to function
intellectually at all levels of the
stack which which i deeply appreciate
this was really fun you're a great
educator a great scientist it's
it's an honor that you spend your
valuable time with me it's an honor that
you would spend your time with me as
well thanks jeff
thanks for listening to this
conversation with jeff shainline to
support this podcast please check out
our sponsors in the description
and now let me leave you with some words
from the great john carmack who surely
will be a guest on this podcast soon
because of the nature of moore's law
anything that an extremely clever
graphics programmer can do at one point
can be replicated by a merely competent
programmer some number of years later
thank you for listening and hope to see
you next time