Monday, June 6, 2011

Nature

"I think that I shall never see, a poem lovely as a tree"
-Alfred Joyce Kilmer

It's often said that the beauty of nature far surpasses that of man. Since the goal of this site is to generate pretty pictures algorithmically, I thought it would be interesting to think about how nature works from this perspective, so we can shamelessly rip off its ideas.

There are a small number of types

Nature seems to work with "alphabets", i.e. a small number of types that combine at different levels to produce different behaviour.

E.g. quarks combine to form protons and neutrons, which combine with electrons to form the chemical elements. I don't think this is just an artefact of how humans have modelled science, but rather a necessary part of its ability to generate complex outcomes like the universe, nature, and intelligent beings like us actually thinking about and experiencing it all.


More alphabets: 4 letters in DNA, combinations of which can be read as instructions to make a limited number of amino acids (around 20), which are again combined to make proteins, which make cells, which make up living things.

Human-invented examples include written alphabets (letters combining into words and sentences) and binary code (bits that can encode both data and machine instructions, with large programs composed of smaller parts like functions).
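
As a tiny illustration (my own toy example, not from the original post), the same eight bits can be read as a number or as a letter, depending on which level of the system is interpreting them:

```javascript
// The same byte, interpreted two ways.
const bits = "01000001";                        // one byte as a bit string
const asNumber = parseInt(bits, 2);             // read it as a number: 65
const asLetter = String.fromCharCode(asNumber); // read it as a character: "A"
console.log(asNumber, asLetter);                // 65 "A"
```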

It's interesting to ponder whether things "have to be this way". Chinese shows there is a different way to represent words with symbols than the English-style alphabet, and while the graphical representation may arguably look better than letters making up words, it doesn't compress as well. That letters work at all seems to indicate humans speak with only a limited number of sounds. The fact that both humans and nature use this approach suggests it's a good design decision (not forgetting that humans are part of nature, so the same trick has been discovered to be useful yet again, in a system running another level up).

Complexity comes from iterations, interactions and feedback loops

If there are a small number of types, and a small number of laws - where does the complexity come from? The answer (in our terms) is the sheer number of interactions and calculations! The numbers are enormous (a toy sketch after the list shows how quickly simple iterated rules get complicated):

-3 billion nucleotides (letters) in human genome
-trillions of atoms in a cell
-trillions of cells in an animal
-billions of galaxies (a fun simulator)
-universe is billions of years old
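
To make that concrete, here is a toy sketch (my own, with arbitrary sizes): a one-dimensional cellular automaton (Wolfram's Rule 30) has an alphabet of just two states and a single lookup-table law, yet iterating it produces patterns that look anything but simple.

```javascript
// Rule 30 cellular automaton: two states, one rule, lots of iterations.
const RULE = 30;                   // the rule number doubles as the lookup table
let cells = new Array(64).fill(0);
cells[32] = 1;                     // start with a single live cell

for (let step = 0; step < 24; step++) {
  console.log(cells.map(c => (c ? "#" : ".")).join(""));
  cells = cells.map((c, i) => {
    const left = cells[(i + cells.length - 1) % cells.length];
    const right = cells[(i + 1) % cells.length];
    const pattern = (left << 2) | (c << 1) | right; // neighbourhood as 3 bits
    return (RULE >> pattern) & 1;                   // look the new state up in the rule
  });
}
```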

Instructions and data are one

Letters in DNA (after a few processes) are interpreted as instructions for a molecular machine to produce certain proteins. But that abstraction is leaky - the "letters" are not just instructions, they are also molecules made of atoms, and so have certain charges and properties. They can interact with other molecules, causing all kinds of interactions, including generating more sequences which can float free of the genome to interact with other machines and processes.

Some sequences have shapes and charges which cause them to fold up into machines of many varieties, including the ribosome, which interprets sequences to generate proteins. This generates even more complicated structures, which again can interact.

Building up from there, you create ever more complex molecular machinery, including machines that alter the sequence in various ways (fix / snip / replace / replicate / mutate) or alter, destroy, enable, etc. other machines, on the way to finally generating biological outcomes. See the Wikipedia article on biological feedback.
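
As a loose analogy (my own toy numbers, not real biochemistry), a negative feedback loop of this kind is easy to sketch: a "machine" produces a protein, the protein itself slows the machine down, and the level settles instead of exploding.

```javascript
// Toy negative feedback: production is throttled by its own output.
let protein = 0;
const productionRate = 10;  // arbitrary units per step
const decayRate = 0.1;      // fraction lost per step

for (let t = 0; t <= 50; t++) {
  const production = productionRate / (1 + protein); // more protein -> less production
  protein += production - decayRate * protein;       // add new, lose a bit to decay
  if (t % 10 === 0) console.log(t, protein.toFixed(2)); // settles near a stable level
}
```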

Computers store both data and machine instructions as bits in memory. Some computer languages (e.g. Lisp and other dynamic languages) allow you to manipulate code as data, which can be extremely powerful - and I think that power is currently very underutilised.
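
A small sketch of what that looks like in practice (my own example; the expression format is made up for illustration): a JavaScript program held as a plain nested array can be rewritten like any other data, and only then turned into running code.

```javascript
// An expression as data: ["+", "x", ["*", "x", 2]] means x + (x * 2).
function compile(expr) {
  if (!Array.isArray(expr)) return String(expr);  // variable or constant
  const [op, left, right] = expr;
  return "(" + compile(left) + " " + op + " " + compile(right) + ")";
}

// A rewrite pass over the data: replace every constant 2 with 3.
function rewrite(expr) {
  if (!Array.isArray(expr)) return expr === 2 ? 3 : expr;
  return [expr[0], rewrite(expr[1]), rewrite(expr[2])];
}

const tree = ["+", "x", ["*", "x", 2]];
const f = new Function("x", "return " + compile(tree) + ";");
const g = new Function("x", "return " + compile(rewrite(tree)) + ";");

console.log(f(5)); // 15: x + x*2
console.log(g(5)); // 20: x + x*3, after editing the code as data
```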

Output

Try to imagine the amount of computation required to perform the following:

A universe's worth of particles interact with each other. Some form replicating patterns, converting inert atoms into replicating ones. These instructions, which execute atop the laws of physics, can interpret and modify themselves, and some are better at utilising resources and replicating themselves than others. The best ones spread, and variation causes the better instructions to occupy all available niches, which of course shift as they all interact and compete with each other.

Run this simulation for billions and billions of years. What's the output? If you hit a breakpoint, the code would be the current genetic diversity of all living creatures on Earth.

The data is the organic matter in the soil, the atmosphere with free oxygen and all of the atoms arranged into the flowers, feathers, leaves, seeds, hair, teeth, flesh and fins of life which covers the surface of the planet - and all of this beauty and unimaginable complexity emerged from the repeated application of relatively simple algorithmic processes.

Coding goal

Construct a program with an "alphabet" which is interpreted and executed (or run on a simulation), and which can build something that can run over the instructions again and produce something else.

I think this feedback loop would be enough to kick off an evolving system that uses its own success as a fitness selector (so genetic algorithms running for a while would push out the complexity (hopefully!)).

The implementation would be a tree of characters (letters/symbols etc.) which represent strings of JavaScript, which can be copied then slightly altered in various ways (e.g. swapping out variables, altering stack positions, or altering the program in other ways).
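
Here is a very rough sketch of the kind of loop I have in mind (all the names, the toy alphabet, and the target function are placeholders, not a finished design): candidate genomes are small strings of JavaScript, they are run to score them against a target, and the best ones are copied with small random alterations.

```javascript
// Genomes are strings of JavaScript built from a tiny alphabet, kept
// syntactically valid by alternating operands and operators.
const OPERANDS = ["x", "1", "2", "3"];
const OPERATORS = ["+", "-", "*"];
const pick = pool => pool[Math.floor(Math.random() * pool.length)];

function randomGenome(terms) {
  let s = pick(OPERANDS);
  for (let i = 0; i < terms; i++) s += pick(OPERATORS) + pick(OPERANDS);
  return s;
}

// Mutation: swap one symbol for another of the same kind.
function mutate(genome) {
  const i = Math.floor(Math.random() * genome.length);
  const pool = OPERATORS.includes(genome[i]) ? OPERATORS : OPERANDS;
  return genome.slice(0, i) + pick(pool) + genome.slice(i + 1);
}

// Fitness: how closely the genome, run as code, matches a target function
// (here arbitrarily x*x + 1). Lower error is better.
function fitness(genome) {
  let f;
  try { f = new Function("x", "return " + genome + ";"); }
  catch (e) { return Infinity; }   // unparsable genomes score worst
  let error = 0;
  for (let x = 0; x <= 5; x++) error += Math.abs(f(x) - (x * x + 1));
  return error;
}

// The feedback loop: score, keep the best, copy with mutations, repeat.
let population = Array.from({ length: 50 }, () => randomGenome(2));
for (let gen = 0; gen < 200; gen++) {
  population.sort((a, b) => fitness(a) - fitness(b));
  const parents = population.slice(0, 10);
  population = parents.concat(
    Array.from({ length: 40 }, () => mutate(pick(parents)))
  );
}
console.log(population[0], fitness(population[0])); // best genome found so far
```

The end result here can only ever be a small arithmetic expression, of course; the interesting step the post is aiming at is letting the evolved code itself run over and rewrite the sequences it is built from.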

4 comments:

  1. Dude I love the way you think! Great ideas and concepts I follow you. I thought about these concepts while in a biology class in college. Let me know what comes of your discoveries (^.^)b! I would love to see or even help you with this goal.

  2. Hi Jeb, this really fascinates me too, and I've recently changed from Java programmer to Bioinformatician.

    Biologists could do really well learning to code, and coders could learn from biology too. At the moment I'm ridiculously busy coding Perl scripts, but one day I hope to come back to this!

  3. http://www.infoq.com/presentations/We-Really-Dont-Know-How-To-Compute

  4. more thoughts:

    Biology is the act of trying to debug life, the most complex code in the universe.

    Debugging a computer program quickly teaches you a deep respect for empiricism.

    People used to think that bits of DNA they didn't understand were 'junk DNA', but more and more activity, networks & processes are being found in it. DNA is well conserved over massive amounts of replication, at great expense - if those regions were really useless, it's unlikely nature wouldn't have optimised them away by now.

    http://en.wikipedia.org/wiki/Kolmogorov_complexity says that the complexity of a string is measured by how hard it is to compress. Put the other way round: the higher the Kolmogorov complexity, the better the machine you can build with the least amount of resources. Nature is the ultimate demoscene.

    DNA is of course very hard to compress (nature uses -O11), and the low-complexity areas are probably shaped that way for a higher-level structural reason (i.e. affecting the overall charge of the long strand of DNA and deciding how it will curl up into the shape of a chromosome), so there's more information there, just another step higher up in the system.
