Why the Tower of Babel Exists:
Biomedical science is a bottom-up enterprise
Efficiency of competitive systems
Multiple independent discovery
External enabling technology and knowledge
Tension between dissemination and control
Fundamental desire to be cited
Fundamental need to control intellectual property
Implicit citation through nomenclature
If you are using my name, you are citing my discovery
Name Space Collisions:
Molecular biology has an extraordinarily complex vocabulary
Many terms with highly specific meanings that are used rarely
Plasmid, pUC13, M13, cosmid, fosmid, YAC, BAC, PAC, …
All cloning vectors, each with specific properties and uses
High information content per word
Compression through acronyms => collisions across domains
PCR
Polymerase Chain Reaction
Historically, MeSH indexed PCR as an abbreviation for 'premature contraction' in cardiology
Phosphocreatine in metabolism and physiology
Specific definitions with high information content
Association
Generally a rather vague relationship
In statistical genetics, a precisely defined criterion implying that specific tests for significance have been met
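The PCR collision listed above can be illustrated with a minimal sketch of context-based acronym expansion. The three expansions come from the slide; the cue words, the `SENSES` table, and the `expand_pcr` function are hypothetical and chosen only for illustration. This is a toy overlap count, not a description of how MeSH or any real indexer resolves acronyms.

```python
# Hypothetical sense inventory: the expansions come from the slide,
# the cue words are illustrative assumptions.
SENSES = {
    "polymerase chain reaction": {"primer", "amplification", "dna", "template", "cycles"},
    "premature contraction": {"ecg", "ventricular", "arrhythmia", "heartbeat", "cardiac"},
    "phosphocreatine": {"muscle", "atp", "creatine", "energy", "kinase"},
}


def expand_pcr(sentence: str) -> list[str]:
    """Return the expansions of 'PCR' best supported by cue words in the sentence."""
    # Lowercase each token and strip surrounding punctuation.
    words = {w.strip(".,;:()").lower() for w in sentence.split()}
    # Score each sense by how many of its cue words appear in the sentence.
    scores = {sense: len(cues & words) for sense, cues in SENSES.items()}
    best = max(scores.values())
    if best == 0:
        # No contextual evidence: every sense remains possible.
        return sorted(SENSES)
    return [sense for sense, n in scores.items() if n == best]


molecular = expand_pcr("PCR amplification of the DNA template used 30 cycles")
metabolic = expand_pcr("PCR levels fell in skeletal muscle as ATP was consumed")
ambiguous = expand_pcr("PCR was measured")
```

Without domain context (the `ambiguous` case) the acronym cannot be resolved at all, which is the collision problem in miniature.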
Biomedical Text is Not “Well Classifiable”:
Classifiable domain
Well defined robust classes
Class definitions ~robust to algorithms and metrics
Poorly classifiable domains
Class boundaries not clear, class definitions not robust
Really just saying the best classification is one big class
Biomedical text is a web, not a collection of well-defined, domain-specific corpora
Is an article about P53 molecular biology, gene expression regulation or cancer biology?
Probabilistic Nature of Biomedical Knowledge:
Bayes' rule
I know what I have observed
I can only probabilistically rank hypotheses
Understanding evolves as more data becomes available
Language links to understanding
As the understanding evolves, the meaning of the language evolves
Ask 3 biologists to define a gene and you will get 5 definitions and 2 dissenting opinions
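The Bayes' rule points on this slide (observations are fixed, hypotheses can only be ranked, and the ranking shifts as data accumulate) can be sketched in a few lines. The hypothesis names, priors, and likelihoods below are hypothetical numbers for illustration only; they are not drawn from any real study.

```python
def posterior(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Bayes' rule: P(H | D) = P(D | H) * P(H) / sum over H' of P(D | H') * P(H')."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical starting beliefs: both hypotheses equally plausible.
beliefs = {"gene_causes_phenotype": 0.5, "no_effect": 0.5}

# Hypothetical likelihoods P(observation | hypothesis) for two experiments.
observations = [
    {"gene_causes_phenotype": 0.8, "no_effect": 0.4},
    {"gene_causes_phenotype": 0.8, "no_effect": 0.4},
]

# Understanding evolves as each new observation arrives.
for likelihood in observations:
    beliefs = posterior(beliefs, likelihood)

# Hypotheses are ranked, never proven.
ranking = sorted(beliefs, key=beliefs.get, reverse=True)
```

Each observation shifts the probabilities without ever driving the weaker hypothesis to zero, so any term whose definition depends on the leading hypothesis remains provisional.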
Questions for Ontologies Session:
How do we represent probabilistic concepts and meanings with logically precise standards?
How do we associate the appropriate domain specific ontology(ies) with the text we are analyzing?
How do we create sustainable merges across evolving domain specific ontologies?