
Finding our way

Added: January 08, 2008
Presentation Category: Education

Presentation Transcript

Finding our way in information space:
- Phil Ashworth
- Phil Scordis


Slide 2: UCB: The Next Generation Biopharmaceutical Leader
- R&D activities at 10 global sites
- R&D headcount = 2,100 (August 2007)
- Monheim (De)
- Global biopharmaceutical company with specialist focus: Neurology, Inflammation and Oncology
- Proven sales and marketing, creating global brands: Keppra®, Xyzal®, Zyrtec®
- Revenues of €3.5 billion in 2006 (pro forma)
- Successfully transformed with:
  - Celltech acquisition in 2004
  - Integration of SCHWARZ PHARMA in September 2007
- Over 10,000 employees across more than 40 countries
- Listed on EURONEXT (Brussels); current market cap of €7.5 bn


Apology:
- Health warning: we are still in the middle of all of this, and I don’t have all of the answers


History:
- Research and Development in UCB
  - Comes from integration of Schwarz Pharma, Celltech, OGS, Chiroscience, Darwin
- Variety of data source issues
  - Silos, vendor systems, structured, unstructured etc.
- Data integration
  - A mess of legacy approaches, and many situations where no attempt has been made
- To warehouse or not to warehouse?
  - After the rollout of a research warehouse, at least two distinct examples of different working practice “break” the model
  - Difficult to extend and rebuild warehouses: just another rigid system


Principles and Ideals of the Semantic Web:
- “The Semantic Web is an extension of the current Web in which information is given well-defined meaning, better enabling computers and people to work in cooperation.” [Tim Berners-Lee et al 2001]
- Ideal environment
  - Starting from scratch, building connectivity
  - Start defining the problem space from a blank page
- How applicable is this attractive approach to us?
- Let’s find out…


The Dream:
- What did we want?
  - Getting UCB’s pipeline to market faster
  - Better ROI: an environment in which investment in data generation can be exploited to the full
  - Breaking down data boundaries
- Major areas for improvement
  - Operational orchestration
  - Data integration
  - Knowledge discovery and creation
- The fantasy
  - Legacy systems remain in place where appropriate
  - Data integration is seamless and facilitates aggregation and query based on the meaning of the data
  - Facilitated exploration of data and exploitation of connections


Starting the journey:
- Heard of others oscillating around the semantic vs warehouse question
  - Large investment in both technologies, building components, rolling out home-built solutions
- Our initial investment
  - Minimal resource
  - Limited to vendor applications (best of breed) rather than building our own
  - But not the all-or-nothing approach offered by some
- Our learning curve has been steep
  - Made many mistakes
  - Visited many dead ends
  - Experienced limitations first hand
  - Had many frustrations
- Data integration was our key goal


Where to start:
- Principles of the Semantic Web
  - Understanding the concepts of semantics: so much reading
- Semantic technologies
  - Differences between the semantic and OO mindsets
- Academia
  - Some nice projects, but not enterprise orientated
- Data integration
  - RDF
    - Has desirable flexibility and inherent potential for integration
  - OWL
    - Builds on top of RDF
    - Potential for a rich descriptive framework, plus the power of DL (description logic) to facilitate knowledge discovery through reasoning and making connections
- But our data is in relational systems!
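The integration potential of RDF mentioned on this slide can be illustrated with a minimal sketch in plain Python, with no RDF library. It assumes triples are modelled as (subject, predicate, object) tuples; the namespace, the two sources and every URI are hypothetical and not taken from the talk.

```python
# Triples are (subject, predicate, object) tuples; all URIs are hypothetical.
EX = "http://example.org/"

assay_source = {
    (EX + "compound/42", EX + "inhibits", EX + "target/TNF"),
}
project_source = {
    (EX + "target/TNF", EX + "studiedIn", EX + "area/Inflammation"),
    (EX + "compound/42", EX + "inhibits", EX + "target/TNF"),  # duplicate fact
}

# Integration is a set union: no shared table schema has to be agreed first,
# and the duplicated statement collapses to a single triple.
merged = assay_source | project_source


def objects(graph, subject, predicate):
    """Return all objects for a given subject and predicate."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def areas_for_compound(graph, compound):
    """Follow compound -> target -> area, a path that spans both sources."""
    areas = set()
    for target in objects(graph, compound, EX + "inhibits"):
        areas |= objects(graph, target, EX + "studiedIn")
    return areas


print(areas_for_compound(merged, EX + "compound/42"))
```

The two sources connect only because they use the same URI for the target. In practice, agreeing on those identifiers is likely where much of the integration effort goes.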


How to integrate: Getting RDF from RDB:
- D2RQ
  - Offered the ability to read/query relational databases as RDF
  - Limitations
    - Open source
    - Didn’t work on real-world databases in our hands
    - Concerns over query speed when using multiple data sources; we wanted an asynchronous distributed environment
    - Reasoning very slow across multiple data sources (forward chaining)
- Cerebra server
  - Tantalising prospect. A dead end?
  - Recent changes within the company meant that the direction for the tool was uncertain
- SDS: interesting prospect (www.insilicodiscovery.com)
  - Integrated query environment across a variety of data sources (relational, Excel, web services etc.)
  - Distributed asynchronous computing model
  - No RDF!
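As a rough illustration of the general idea of exposing relational rows as RDF, the sketch below turns each row of a table into N-Triples lines. It is not D2RQ and does not use D2RQ's mapping language. The table, its columns, the `BASE` namespace and the helper `rows_to_ntriples` are hypothetical; every value is emitted as a plain string literal, and escaping is kept minimal.

```python
import sqlite3

BASE = "http://example.org/"  # hypothetical namespace


def rows_to_ntriples(conn, table, key_column):
    """Yield one N-Triples line per non-null, non-key cell of the table."""
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        # The primary key becomes the subject URI of the row.
        subject = f"<{BASE}{table}/{record[key_column]}>"
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # NULL cells simply produce no triple
            predicate = f"<{BASE}{table}#{column}>"
            # Minimal escaping only: backslashes and double quotes.
            literal = str(value).replace("\\", "\\\\").replace('"', '\\"')
            yield f'{subject} {predicate} "{literal}" .'


# Hypothetical in-memory table standing in for a real relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE compound (id INTEGER PRIMARY KEY, name TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO compound VALUES (?, ?, ?)",
    [(1, "cmpd-A", "TNF"), (2, "cmpd-B", None)],
)

triples = list(rows_to_ntriples(conn, "compound", "id"))
for line in triples:
    print(line)
```

This copies the data out in one pass, whereas the slide describes reading and querying the database as RDF in place. The row-to-subject and column-to-predicate mapping is the part the two approaches share.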


How to integrate: RDF Stores / Warehouse:
- Triple stores
  - Allegrograph (Franz)
  - Sesame
- Problems
  - Immature technology: data volumes are limited with respect to life science data volumes
  - Security and backup: primitive
  - Limited integration with other tools. Needed tighter integration: queries were not being carried out directly in the RDF stores
  - Again, slow queries and reasoning from tools due to forward chaining
  - Still have data duplication issues and requirements for ETL processes
- One step forward, two steps back!
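To show why forward chaining can be slow at volume, here is a minimal sketch of a naive forward-chaining loop over two RDFS-style rules. It is not the algorithm of any tool named on the slides; the rule set, the prefixed names and the sample data are assumptions chosen for illustration.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def forward_chain(triples):
    """Apply two RDFS-style rules repeatedly until no new triples appear."""
    inferred = set(triples)
    while True:
        new = set()
        subclass_pairs = [(s, o) for s, p, o in inferred if p == SUBCLASS]
        for child, parent in subclass_pairs:
            for s, p, o in inferred:
                # Rule 1: subClassOf is transitive.
                if p == SUBCLASS and s == parent:
                    new.add((child, SUBCLASS, o))
                # Rule 2: an instance of a class is an instance of its superclass.
                if p == TYPE and o == child:
                    new.add((s, TYPE, parent))
        if new <= inferred:
            return inferred  # fixed point reached
        inferred |= new


# Hypothetical sample data: three asserted triples.
asserted = {
    ("ex:Kinase", SUBCLASS, "ex:Enzyme"),
    ("ex:Enzyme", SUBCLASS, "ex:Protein"),
    ("ex:p38", TYPE, "ex:Kinase"),
}

closure = forward_chain(asserted)
print(len(asserted), "asserted ->", len(closure), "after forward chaining")
```

Every derivable triple is materialised before any query runs, and each pass rescans the whole set. Both costs grow with data volume, which is consistent with the slow reasoning reported on the slide.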


How to integrate: Development Tools:
- Few professional development and deployment environments
  - Roll your own vs the use of open source
- Protégé
  - Great for model development but lacked integration with other tools (when we looked)
- TopBraidComposer (TopQuadrant)
  - Excellent functionality out of the box: easy interface, file imports, navigation etc.
  - Integrated with a variety of third-party systems: D2RQ, Allegrograph, Sesame, Jena, Oracle
  - But still could not do everything we wanted it to
  - TopQuadrant supported our limited resource to enhance our understanding and knowledge
  - TopBraidLive: one of the first development -> deployment applications
- Reasoners
  - Several looked at; each had its quirks
  - None did as we thought or wanted with the data volume we had
  - Used rules to achieve what we needed. Isn’t this cheating?


Stop the journey – we are getting off:
- We have tried to achieve data integration, chasing several avenues
  - RDF from RDB
  - RDF warehouse
    - Via RDB data -> txt -> RDF -> RDF store
  - Semantic SOA, another approach
  - Pragmatic semantics
- Now we understand the messages others have been trying to pass
  - Blowing hot and cold on the whole idea
  - Wavering over semantic vs conventional warehousing
  - Heavy investment in home-brew technology or enterprise environment
- Is this a dead end?


The end:
- Thanks for coming …


Hang on, we are not giving up yet:
- We decided to persevere
  - But we still don’t have a large amount of resource to throw at this
  - We need to take a different path
- Community action
  - Collaboration
  - There is a vibrant and active community out there
  - W3C …
  - Involved in direction and calling for standards


So where are we today?


Driving change:
- TopBraidComposer: a semantic development environment using open source and limited data integration tools
  - Help with SDS
  - Tighter integration with RDF stores
  - TQ also had to drive other vendors to provide functionality for them
  - Many other changes as we pushed the boundaries of the tool
  - TopBraidLive looks very promising as an easy deployment environment
- SDS: a data integration platform, enterprise ready, lacking a semantic direction
  - SPARQL integration (not just RDF from RDB, but RDF from RDB, Excel and web services)
  - We believe this is key to our future strategy
  - Changes to their interfaces, tools and capabilities
  - Integration with TBC
- UCB is driving collaborative development
  - Helping bring companies together (a big thank you to TQ and ISD)
  - Helping drive the community


In Summary:
- The semantic wave is too large to surf alone
  - Too unpredictable to control
- There are some big hurdles to overcome
  - Integration, tools, enterprise solutions, visualisation, orchestration
- However, we are committed to helping make things happen
  - Always on the lookout for open-minded enthusiasts
  - Committed to contributing to the community
- Still believe that Semantic Technologies are part of the solution
  - But it is not just something we can adopt (at the moment)
  - It is still something we have to help forge so others can be adopters


Thank you:
- Any advice?
- Questions?