BioSec 2012: Elizabeth

From Soma-notes

Elizabeth's BioSec Notes

(Organized by class dates) Brain dumps, class notes, useful insights, points of confusion, it's all here.

Jan 25

Class readings:

Chapter 2: Origins of Life
Chapter 3: Selection, Biodiversity, and Biosphere

In-class notes:

  • Chemistry Review
    • energy difference between reactants and products in a chemical reaction
    • however, you need an input of energy to begin the reaction
  • a catalyst changes (lowers) the energy needed to reach the intermediate state, making the reaction more likely to take place
    • the catalyst is unchanged in the process
  • Biological catalysts are enzymes (proteins) that hold the reactants and situate them in such a way that the reaction can happen more easily
    • enzymes move the reactants around
    • enzymes can have crystalline structure
  • cell logic is built on pattern-matching
    • enzyme is looking for the reactants that fit its receptors
  • ATP: Adenosine Triphosphate
    • energy carrier/source for cells
    • universal resource, used by all cells
  • ADP: Adenosine Diphosphate
    • similar to ATP, but has one fewer phosphate group
    • has lower energy than ATP
    • the cell expends energy to turn it into ATP
    • then the cell breaks up the ATP to use the stored energy
  • Eukaryotic cells vs. Prokaryotic cells
    • in eukaryotic cells, the genetic sequence isn't simply copied from DNA to RNA. Instead, parts of the sequence are picked, chosen, and spliced together into the RNA that is then translated into proteins.
    • this means that a lot of the information in the DNA is there to control and regulate how parts are edited and assembled.
  • Because of how evolution works (building on what already worked), understanding how a system works is equivalent to understanding its history, and why it is the way it is.
    • however, it can be hard to know where stuff came from, and what came first

Jan 27

Class readings:

Chapter 4: Energy and Enzymes
Chapter 5: Membranes and Transport

In-class notes:

  • ATP provides the energy to shake things up, get things moving so that reactions can go
  • Fluid Mosaic Model: active transport
    • steric: does it fit through the transport channel?
    • charge: does it have the right charge?
    • selectively open: channels can be open or closed
  • Chapter 6 (Cellular respiration) preview
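  • respiration (everyday sense) = the process of getting oxygen into the cell; cellular respiration = the cell using that oxygen to extract energy (ATP) from glucose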
  • glycolysis: ancient process to create ATP, doesn't involve oxygen
  • Figure 6.5 - important diagram
    • glucose oxidation happens in steps so that more energy (ATP) can be harnessed and not lost as heat (wasted energy)
  • there isn't anything particularly special about the molecules used in cellular respiration (glucose), but it is noteworthy that all eukaryotic cells use the same molecules and seem to be evolutionarily related
  • look at the process/architecture of the Calvin and Krebs cycles

Feb 1

Class readings:

Chapter 6: Cellular Respiration
Chapter 7: Photosynthesis

Pre-class notes:

Both chapters address how cells make ATP and other byproducts.

  • not a lot of discussion about how the two processes fit together (I mean, photosynthesis is the more important because it creates the glucose for cellular respiration to use?)
  • much of the in-depth chemistry was confusing
  • In Ch 7, I didn't fully understand the last section about photorespiration and how plants avoid it
    • what is the problem, really?
    • I understand that the C4 cycle resolves it

Possible application-y thoughts

  • both photosynthesis and cellular respiration involve a lot of cyclical processes (like loops, I suppose) that transform one product into another
  • the cellular structure model seems like it could be applied to computers (and is similar to what exists), but maybe the metaphor could be extended to be larger?
  • what would ATP map to in the computer world? Information output?
  • It seems that the processes are finely tuned so that most of the by-products (except energy lost in heat) get used - is there a moral in that story?

In-class notes:

  • Photorespiration
    • The enzyme that fixes carbon dioxide (CO2) into sugar (rubisco) can also grab oxygen (O2) by mistake. When it does, the carbon isn't used to make glucose, and the cell wastes energy cleaning up
    • is an example of how the process isn't completely specific/specialized, and still carries limitations, even after selection
    • evolution isn't perfect (and has limitations), and sometimes, these get papered over and the cell goes on living with them
    • in this case, the C4 cycle has developed to handle the limitation
  • Linear Electron Chain
    • begins with photon from the sun (energy)
    • the chain of molecules that take the energy from the sun and pass it along
      • each stage uses what it can of the energy and passes the rest along
      • most of the energy ends up going to proton pumps, which drive the creation of ATP
  • NADP is an electron carrier molecule


Q: How would this kind of system evolve? What kind of pressures must have existed?

  • Photosynthesis is a pretty efficient process
    • however, it is less efficient than cellular respiration
    • and considerably less efficient than any process designed by humans (such as the internal combustion engine)
  • Photosynthesis and cellular respiration aren't divorced processes
    • the plant uses its glucose (from photosynthesis) in the mitochondria, to create more ATP (when the sun isn't shining)
    • but animals get their glucose from what they eat, which is then used by the mitochondria
    • plants effectively store energy in glucose (ex. maple tree sap)
  • Plants are net producers of oxygen, and net consumers of carbon dioxide
  • Humans are net producers of carbon dioxide, and net consumers of oxygen
    • Earth's atmosphere is about 78% nitrogen, 21% oxygen, and only about 0.04% carbon dioxide


Computer security

  • plants provide energy to almost everyone else, but why? why haven't they evolved to protect themselves from animals?
  • animals prevent the plants from consuming everything
    • by themselves, the plants are unsustainable
    • they need someone to eat them
  • so, might computer security be in need of a predator?
    • do we need to find some constant pressure to keep security on the path?
  • How can we work out a system where the pressures create better, stronger systems?
    • i.e. one where evolution will take place
    • predation addresses material imbalances
    • what are the inputs and outputs of computer security?
      • on the internet, there seems to be a lot of information, but the challenge is in parsing it into wisdom

Feb 3

Class readings:

Chapter 8: Cell Communication

In-class notes:

Hormone: messenger molecule

  • creates localized state change
  • kind of an interface to the cell
  • hormones mediate reactions

Crosstalk:

  • different hormones interfere with each other
  • a given receptor can be activated by different molecules
  • a molecule can activate different receptors
  • the network begins as a fully connected graph, and then connections are pruned away
  • crosstalk is why drugs have complicated and unpredictable side effects
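A rough sketch of the "start fully connected, then prune" idea, in Python. The hormone and receptor names, and the set of connections that end up "working", are made up for illustration; they are not from the textbook or from class. The point is only that crosstalk shows up as a receptor that is still reachable from more than one signal after pruning.

```python
from itertools import product

def fully_connected(signals, receptors):
    """Start with every signal wired to every receptor."""
    return set(product(signals, receptors))

def prune(edges, keeps):
    """Keep only the edges that the (hypothetical) test `keeps` approves."""
    return {edge for edge in edges if keeps(edge)}

signals = ["hormone_a", "hormone_b"]
receptors = ["r1", "r2", "r3"]
edges = fully_connected(signals, receptors)

# Assumed outcome of trial and error: which connections turned out to be useful.
useful = {("hormone_a", "r1"), ("hormone_a", "r2"), ("hormone_b", "r2")}
pruned = prune(edges, lambda edge: edge in useful)

# Crosstalk: receptors that can still be activated by more than one signal.
shared = sorted(
    r for r in receptors
    if sum(1 for _, target in pruned if target == r) > 1
)
print(len(edges), len(pruned), shared)
```

Here the drug-side-effect problem would correspond to anything that targets `r2`: it touches the pathways of both hormones at once.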

We could consider the "drug discovery problem" to be equivalent to the "computer security problem".

  • Engineering challenge
    • every input is connected to every output
    • through trial and error, select for the pathways that work
  • moral of the story: there needs to be more coupling than we think in computer systems
    • we need to allow for feedback loops, running parallel to the main operations
  • Metabolic diseases are really receptor diseases
    • the question is "what receptor does it target?"
    • this is why viruses only affect certain tissues: the tissues where the receptors are located are affected
  • some diseases (such as avian flu) can be caught by humans from animals, but not spread between humans

Feb 8

Class readings:

None, discussion of the wiki, plans for moving forward

In-class notes:

DNA: Deoxyribonucleic acid

  • two strands, twisted around each other into a double helix
    • form is very stable, sort of like a zipper
  • C-G, A-T pairs of nucleotides
  • different ends on each strand (3' and 5')
  • the chain forms redundant representations
    • the duplication and the structure are in place in order to protect the information
    • the duplication also provides a built-in way of replicating the information

DNA polymerase

  • attaches to DNA, ratchets itself along the nucleotides
    • reads the template strand from 3' to 5' (so the new strand grows 5' to 3'), only in one direction
  • copies the DNA as it goes
  • does error checking, but the duplication is still a moment of vulnerability for DNA corruption
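To make the "redundant representation" point concrete, here is a minimal Python sketch of base pairing. The function names (`complement`, `copy_strand`, `mismatches`) are my own, and strands are written as plain strings in 5' to 3' order; real polymerase proofreading is far more involved than a string comparison.

```python
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(strand):
    """Pair each base with its partner (A-T, C-G), position by position."""
    return "".join(PAIRS[base] for base in strand)

def copy_strand(template):
    """Build the partner strand, written in its own 5' to 3' order.

    The two strands run in opposite directions, so the partner is the
    complement of the template read backwards.
    """
    return complement(template)[::-1]

def mismatches(strand, partner):
    """Positions (in the partner) where it fails to pair with the strand."""
    expected = copy_strand(strand)
    return [i for i, (e, p) in enumerate(zip(expected, partner)) if e != p]

original = "ATCG"
partner = copy_strand(original)   # "CGAT"
rebuilt = copy_strand(partner)    # either strand can rebuild the other
print(partner, rebuilt, mismatches(original, "CGTT"))
```

Because each strand fully determines the other, a damaged base on one side can be detected (and in principle repaired) by comparing against its partner.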

DNA structure

  • the structure of DNA is rigid, and takes up a lot of space
  • to conserve room, DNA is wrapped around histones to form chromosomes
    • ball of string metaphor
  • when packaged like this, it is effectively "off" and can't be used
    • so different cells unpackage and use different parts of the DNA
    • chromosomes are dynamic structures, only created in duplication processes

Hydrogen bonds

  • more like magnets than glue
  • when you pull them apart, the bonds disappear, but you can then put them back together

Technology

  • technology is fundamentally destructive
  • what assumptions do you have to make about putting it back together?
    • we assume we know how many pieces there are
    • we assume the pieces are unique
    • errors in recombining cause diseases (structural problems)

Telomeres

  • tails on chromosomes that tell how many times a cell can reproduce
  • linked to aging and premature aging diseases

What are the skeletons in computer science's closet?

  • emergence
    • emergent behaviour == bugs
      • the products of interactions that you didn't know about
  • computability (?)
  • is computer science really a science?
    • it doesn't seem to have the big questions of methodology that would make it a science
    • basically, is math + engineering
    • is there a larger discipline of which biology and computer science are subfields?

Feb 10

Class readings:

Chapter 9: Cell Cycles

In-class notes:

Missed class for me

Feb 15

Class readings:

Chapter 10: Genetic Recombination

In-class notes:

evolution = variability + selection
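A toy version of "variability + selection" in Python, as a sketch only. Everything here is an assumption for illustration: the genome is a list of bits, the "environment" is a target bit pattern, variability is a single random bit flip, and selection keeps the child only if it fits the environment at least as well as the parent.

```python
import random

def fitness(genome, environment):
    """Number of positions where the genome matches the environment."""
    return sum(g == e for g, e in zip(genome, environment))

def evolve(environment, generations=200, seed=0):
    """Repeat: mutate one bit (variability), keep it if not worse (selection)."""
    rng = random.Random(seed)
    genome = [rng.randint(0, 1) for _ in environment]
    history = [fitness(genome, environment)]
    for _ in range(generations):
        child = genome[:]
        child[rng.randrange(len(child))] ^= 1  # variability: flip one bit
        if fitness(child, environment) >= fitness(genome, environment):
            genome = child                     # selection: keep the fitter one
        history.append(fitness(genome, environment))
    return genome, history

environment = [1, 0, 1, 1, 0, 0, 1, 0]
genome, history = evolve(environment)
print(history[0], history[-1])
```

Note that nothing in the loop "knows" the right answer in advance; the environment does all the steering, which is the point of the question "why does variation work?" further down.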

Bacterial reproduction looks like an API

  • but how do you upgrade a trillion machines that use it?
  • meiosis binds reproduction and variability
  • conjugation is a similar process to what happens in software
    • plugins
    • an updating process
  • meiosis is so complicated - so why do we have it?
    • eukaryotic organisms have so many copies of their genomes that updates make no sense
    • reinstallation is a better option
    • begin with an initial cell, and rebuild the network
    • the only upgrade path is mix & match
    • the machinery of life is set up to make these upgrades go smoothly
    • the reason for limited lifespans
      • the environment will kill you eventually, so it is better to reproduce and die on schedule
  • sexual reproduction is a gradual iterative process
    • big changes over many generations
    • wait, how do we get new species from this?
    • genome is code for full working organism
  • so when you get to a certain level of complexity, you have to move to a sexual model of reproduction
    • ubuntu cycles

Why does variation work?

  • because of the environment

Conjugation

  • bacteria have a single ring-shaped chromosome
    • sometimes also have plasmids, which are other chromosomes
  • we happen to know about the process of conjugation because there is something to observe

PCR: Polymerase Chain Reaction

  • in a test tube, you can have DNA polymerase work on its own
  • this process makes many copies of a DNA segment, and is used as a step in sequencing DNA
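The reason PCR is so useful is the arithmetic: in the idealized case, every cycle doubles the number of copies. A small Python sketch, assuming perfect doubling each cycle (real reactions fall short of this); the function name is mine.

```python
def pcr_copies(initial_copies, cycles):
    """Idealized PCR: the number of copies doubles on every cycle."""
    return initial_copies * 2 ** cycles

for cycles in (10, 20, 30):
    print(cycles, pcr_copies(1, cycles))
```

Starting from a single molecule, 10 cycles gives 1,024 copies and 30 cycles gives 1,073,741,824.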

Transposons

  • jumping DNA
  • protein folding
  • DNA moving around
  • biology is more about subtraction than addition
    • start with everything and prune

Archaea

  • extremophiles
  • how do they manage reproduction in weird (and unfriendly) environments?
  • probably, we (and all other organisms) are the descendants of archaea

Feb 17

Class readings:

Chapter 11: Mendel, Genes, and Inheritance

In-class notes:

Genetics

  • genetics as code
  • why did Mendel choose peas?
    • why did he choose the traits to examine?
      • chose discrete, binary characteristics
      • clean patterns
    • there is some potential for deception if you choose your sample this way
      • however, it probably lets you draw some clearer conclusions from initial work and see some patterns you otherwise might not
  • this means the world is divided into Mendelian genetics and non-Mendelian genetics, but they aren't an even split
    • non-Mendelian genetics is much bigger than Mendelian genetics, because it is essentially everything else
  • Anil: the notion of a gene is a useless concept: genes are just code
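Since we talked about genetics as code, here is the "clean pattern" Mendel saw, written as a few lines of Python. This is a sketch of a single-trait cross only: one gene, two alleles, uppercase meaning dominant. The function names are mine.

```python
from collections import Counter
from itertools import product

def cross(parent1, parent2):
    """Count offspring genotypes when each parent passes on one allele."""
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))

def phenotypes(genotypes):
    """A genotype shows the dominant trait if it has any uppercase allele."""
    counts = Counter()
    for genotype, n in genotypes.items():
        label = "dominant" if any(c.isupper() for c in genotype) else "recessive"
        counts[label] += n
    return counts

offspring = cross("Aa", "Aa")
traits = phenotypes(offspring)
print(dict(offspring), dict(traits))
```

Crossing two `Aa` parents gives genotypes in a 1:2:1 ratio (`AA`, `Aa`, `aa`) and visible traits in a 3:1 ratio, which is the kind of discrete, binary result that made peas a convenient choice.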

Can we see Mendelian-type patterns in code reuse?

  • Mohammed: if it works, the probability of inheritance is 1
  • diploid: 2 copies of everything, backups of everything
  • men: XY
    • the one X gene is used everywhere, so if there are problems on it, they show up
  • women: XX
    • one X is used some places, the other in some places
    • division is random

Hybrid vigor

  • inbreeding leads to weakened traits, as more copies of code are likely to have the same defects
  • selective breeding is trying to make both copies of the gene the same (so that breeding is true)

Annie: We need version control!

Luc: How can we get a better understanding of what is going on in genes / chromosomes?

diploidy = having a backup copy

polyploidy = having multiples

Do we want diploidy in our programs?

  • voting-based systems employ diploidy or polyploidy
    • multiple systems generate the same thing, and vote on the answers
    • n-version programming
  • unfortunately, humans tend to think in the same way, and generate the same errors
    • so, human-generated diversity tends to be a bust
    • Elizabeth: I think this is SO COOL, how cool are people?
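A minimal Python sketch of the voting idea, to show both why it helps and why shared errors defeat it. The three "versions" are toy functions I made up (one with a deliberate bug); this is not a real n-version programming framework.

```python
from collections import Counter

def majority_vote(versions, x):
    """Run every version on x and return the answer most of them agree on."""
    answers = [version(x) for version in versions]
    answer, count = Counter(answers).most_common(1)[0]
    if count <= len(versions) // 2:
        raise ValueError("no majority among versions")
    return answer

def square_a(x):
    return x * x

def square_b(x):
    return x ** 2

def square_buggy(x):
    return x * 2  # deliberate bug: doubles instead of squaring

# Independent errors: the one buggy copy is outvoted.
independent = majority_vote([square_a, square_b, square_buggy], 3)

# Shared errors: two copies with the same bug outvote the correct one.
shared = majority_vote([square_buggy, square_buggy, square_a], 3)
print(independent, shared)
```

The second call is the "bust" case: if the people writing the versions make the same mistake, the vote confidently returns the wrong answer.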

Making the program work != debugging

  • the idea of debugging implies that we're aiming for perfection
  • redundancy will protect against bugs
    • think of engineering a bridge: there is no way to ensure that all bugs are out of it, so there are tolerances designed in to account for those bugs
    • there is no expectation of perfection

Defensive programming

  • does checks to make sure that data (etc.) is in the proper format
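For example, a defensive check might look like the Python sketch below. The function `parse_port` and the choice of a port number as the input are hypothetical, picked only because the valid format is easy to state (a string of digits between 1 and 65535).

```python
def parse_port(text):
    """Return a port number, refusing anything that isn't in the proper format."""
    if not isinstance(text, str):
        raise TypeError("port must be given as a string")
    if not (text.isascii() and text.isdigit()):
        raise ValueError("port must contain only digits")
    port = int(text)
    if not 1 <= port <= 65535:
        raise ValueError("port must be between 1 and 65535")
    return port

print(parse_port("8080"))
```

This is closer to the receptor idea than to the bridge idea: the function only "binds" inputs with the right shape, and rejects everything else up front.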

In code, as in biology

  • there is lots of info that we don't understand the reasons for
  • the interactions aren't understandable
    • so throwing away code that works is a bad idea
    • instead, evolve it, change it, refactor it

Linux kernel

  • about 15 million lines of code as of early 2012 (huge!)
  • has evolved over time to be cleaner, clearer
    • it's because it has been worked on
    • people have tried to fix the small things

Problems with legacy code

  • the culture is lost
  • dead code, the paradigm has gone away
  • code is part of a human system, but when you take away the humans, the code ceases to function
  • Luc: code depends on its humans, the people who use it, keep it alive, help it to evolve
  • code as part of an ecosystem of humans

Are there lessons here about why open-source code is long-term viable?

  • people stick with the process, keep the code alive

Games

  • live (while being developed), then die after being released because of lack of engagement
    • post-release (and subsequent enthusiasm) there is a lack of community involvement
    • this is similar to biology - things live, and then die (and quickly!)

Legacy code is really a statement about the humans behind the code.

Feb 22

Reading week

Feb 24

Reading week

Feb 29

Class readings:

Chapter 12: Genes, Chromosomes, and Human Genetics

In-class notes:

March 1

Class readings:

Chapter 15: Control of Gene Expression

In-class notes:

March 7

Class readings:

Chapter 15: Control of Gene Expression

In-class notes:

March 9

Class readings:

Individual chapters
Chapter 45: Population Ecology

In-class notes:

March 14

Class readings:

Individual chapters
Chapter 46: Population Interactions and Community Ecology

In-class notes:

Applications to computer security

Most of Unit 2 (Chapters 4 to 8) was about the internal workings of cells, how they create energy, and how they communicate and work together. As we have discussed the applications of this kind of biology to computer security, we have been focussing on how to create computer security systems that evolve, so that they can deal with threats in changing ways. One idea we discussed in class is that security needs a predation model to drive its evolution. The predator and prey would exert pressure on each other so that neither was allowed to overrun the system.
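To see what "neither was allowed to overrun the system" can look like, here is a small Python sketch of a predator-prey model. I am using the standard Lotka-Volterra equations, which were not part of our class discussion, and the parameter values and starting populations are arbitrary assumptions. The simulation uses simple Euler steps, so it is only a rough approximation.

```python
def simulate(prey, predators, steps=2000, dt=0.01,
             growth=1.0, predation=0.1, death=1.5, conversion=0.075):
    """Step the Lotka-Volterra equations forward with simple Euler updates."""
    history = [(prey, predators)]
    for _ in range(steps):
        # Prey grow on their own and are eaten in proportion to predators.
        d_prey = prey * (growth - predation * predators)
        # Predators die off on their own and grow in proportion to prey eaten.
        d_predators = predators * (conversion * prey - death)
        prey += dt * d_prey
        predators += dt * d_predators
        history.append((prey, predators))
    return history

history = simulate(prey=10.0, predators=5.0)
prey_counts = [p for p, _ in history]
predator_counts = [q for _, q in history]
print(round(min(prey_counts), 1), round(max(prey_counts), 1))
```

In this model the two populations cycle: prey boom, predators follow, prey crash, predators starve, and the cycle repeats. Neither side wins outright, which is the kind of constant pressure the predation idea is after. What "prey" and "predator" would mean for security is still the open question.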

If we were going to develop a metaphor based on cellular structure and interaction, it seems like the energy source is a key concept. But what would be the ATP of a computer system? If security and non-security were competing, what would they be competing for? What sustains security? (The internet? information?)

We usually view computer security as a sort of moral dilemma - a fight between good and evil where "good" means keeping systems running, without loss of data, and with access control, and evil refers to attacks that want to compromise information, and incapacitate systems. In this construction, it is clear who should win the fight: good should prevail over evil (as in all the best tales). If we reframe this task as an evolutionary struggle, the notion of right and wrong is dropped from the picture, and the question becomes one of survival and selection. However, do we think that this will necessarily lead to a good outcome for users of the systems we want to keep secure? The term "secure" seems to imply a certain perspective, or goal. In terms of computer security, does the idea of a predation model imply that some users will be put on the chopping block to help the security of the rest? How could a predation model be set up so that the "right" features were selected for?

In the chapter about cell communication, we discussed the differences between cellular communication and the communication that takes place in computer programs. In cellular communication, the process seems to be top down: all links are established, then some are pared away. In computer programs, the process is bottom up: links are established on an as-needed basis. My first thought is that having more links could be a security problem - if you want information to stay where it's put, not having many links seems to make sense. However, I can see that having a system with more links could allow for more feedback and could potentially better support an evolutionary system.