Presentation Transcript

Slide 1: 

Synchrotron SOLEIL. Alain BUTEAU (Head of Controls and Data Acquisition software group). The Data Access Challenge

Slide 2: 

Data Access Challenge: motivations and new ideas to manage the problem

NeXus files production : 

Mid-April 2010: about 450 000 NeXus files in the storage facility

Slide 4: 

File retrieval. Thanks to a unique API and a « SOLEIL standardized internal data organization », we have developed common software solutions for all beamlines and decoupled the development of acquisition software from data analysis software. Diagram labels: the COMETE library of data visualization components; file browsing; foxtrot; Data Analysis Application; NeXus Application Interface; SOLEIL NeXus Files. NeXus files choice: is it enough?

Slide 5: 

NeXus files choice: is it enough?

Is standardizing data format the real need ? : 

The real needs for scientists are to: find solutions to data format issues from the data analysis point of view; share algorithms for analyzing data; find the most suitable ways to exchange data.

The foreseen solutions are : 

Choose the NeXus/HDF5 data format. Why not? But it is not enough. Define a standard internal data file structure for experimental data storage. It is a complex process, involving many institutes, many software developers, and many existing data formats and files. But it is worthwhile doing it (so let NeXus people work hard).

From the software point of view a possible solution could be : 

Proposal from the MAHID Group: access data through attributes/tags, and define a standard way to access experimental data.

What is the work to do ? : 

Adapt each application to the Common Data Access API. Implement a CommonDataModel plugin for each data format (formats shown on the slide: SOLEIL NeXus, netCDF, EDF).
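The two tasks above can be sketched in Java as follows. This is only an illustration of the plugin idea, not the actual CDM API: the names `DataFormatPlugin`, `FakeFormatPlugin`, `CommonDataAccess` and `PluginDemo`, the method signatures, and the file extensions `.nxs` and `.nc` are all assumptions made for the example, and the data is faked in memory instead of being read from real files.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Demo entry point (hypothetical): one application, two formats, no format-specific code.
class PluginDemo {
    static CommonDataAccess demoAccess() {
        List<DataFormatPlugin> plugins = List.of(
            new FakeFormatPlugin(".nxs").with("pitch", 1.0, 2.0, 3.0),
            new FakeFormatPlugin(".nc").with("pitch", 4.0, 6.0));
        return new CommonDataAccess(plugins);
    }

    public static void main(String[] args) {
        CommonDataAccess access = demoAccess();
        System.out.println(CommonDataAccess.mean(access.getDataItem("scan_001.nxs", "pitch")));
        System.out.println(CommonDataAccess.mean(access.getDataItem("scan_001.nc", "pitch")));
    }
}

// Hypothetical plugin contract: one implementation per data format.
interface DataFormatPlugin {
    boolean canRead(String fileName);
    double[] getDataItem(String fileName, String keyword);
}

// Stand-in plugin: selects files by extension and serves values held in memory.
class FakeFormatPlugin implements DataFormatPlugin {
    private final String extension;
    private final Map<String, double[]> items = new LinkedHashMap<>();

    FakeFormatPlugin(String extension) {
        this.extension = extension;
    }

    FakeFormatPlugin with(String keyword, double... values) {
        items.put(keyword, values);
        return this;
    }

    @Override
    public boolean canRead(String fileName) {
        return fileName.endsWith(extension);
    }

    @Override
    public double[] getDataItem(String fileName, String keyword) {
        double[] values = items.get(keyword);
        if (values == null) {
            throw new IllegalArgumentException("Unknown keyword: " + keyword);
        }
        return values.clone();
    }
}

// Application side: written once against the interface, unaware of file formats.
class CommonDataAccess {
    private final List<DataFormatPlugin> plugins;

    CommonDataAccess(List<DataFormatPlugin> plugins) {
        this.plugins = plugins;
    }

    double[] getDataItem(String fileName, String keyword) {
        for (DataFormatPlugin plugin : plugins) {
            if (plugin.canRead(fileName)) {
                return plugin.getDataItem(fileName, keyword);
            }
        }
        throw new IllegalArgumentException("No plugin for file: " + fileName);
    }

    // Example of a generic "analysis" step that never looks at the file format.
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }
}
```

In this sketch, supporting a new format means adding one more `DataFormatPlugin` implementation to the list; the analysis code (`mean` here) does not change.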

What is the role of a CommonDataModel plugin ? : 

The data analysis application calls the Common Data Access API with a keyword, e.g. getDataItem(« pitch »). The SOLEIL NeXus plugin looks the keyword up in its dictionary, which maps keywords to locations in the NeXus file: Pitch -> experiment/instrument/d13/pitch, Roll -> experiment/instrument/d13/Roll. Plugin pseudocode from the slide: Double getDataItem(« pitch ») { pitch = dictionary.get(« pitch »); return pitch * 100; } The plugin provides data through keywords and implements the standard definition of these keywords. Let scientists agree on keyword definitions.
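A minimal Java sketch of the dictionary lookup described on this slide is given below. It is not the real CDM or NeXus API: the class `KeywordDictionaryPlugin`, its methods, and the stored values are hypothetical. Unlike the slide's pseudocode, the sketch separates the keyword-to-path lookup from the read, and it treats the slide's `* 100` as a conversion factor configured on the plugin, which is an assumption. The two paths are the ones shown on the slide.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a keyword-dictionary plugin; not the real CDM API.
class KeywordDictionaryPlugin {
    // keyword -> physical path inside the file (paths taken from the slide)
    private final Map<String, String> dictionary = new HashMap<>();
    // path -> stored value; stands in for the content of a NeXus file
    private final Map<String, Double> fileContent = new HashMap<>();
    // Assumption: the slide's "* 100" is modelled as a conversion factor
    private final double scale;

    KeywordDictionaryPlugin(double scale) {
        this.scale = scale;
    }

    void define(String keyword, String path) {
        dictionary.put(keyword, path);
    }

    void store(String path, double rawValue) {
        fileContent.put(path, rawValue);
    }

    // Resolve the keyword to a path, read the raw value, apply the conversion.
    double getDataItem(String keyword) {
        String path = dictionary.get(keyword);
        if (path == null) {
            throw new IllegalArgumentException("Keyword not in dictionary: " + keyword);
        }
        Double raw = fileContent.get(path);
        if (raw == null) {
            throw new IllegalStateException("No data at path: " + path);
        }
        return raw * scale;
    }

    static KeywordDictionaryPlugin demoPlugin() {
        KeywordDictionaryPlugin plugin = new KeywordDictionaryPlugin(100.0);
        plugin.define("pitch", "experiment/instrument/d13/pitch");
        plugin.define("roll", "experiment/instrument/d13/Roll");
        plugin.store("experiment/instrument/d13/pitch", 0.25); // made-up value
        plugin.store("experiment/instrument/d13/Roll", 0.5);   // made-up value
        return plugin;
    }

    public static void main(String[] args) {
        KeywordDictionaryPlugin plugin = demoPlugin();
        System.out.println("pitch = " + plugin.getDataItem("pitch"));
        System.out.println("roll  = " + plugin.getDataItem("roll"));
    }
}
```

The application only ever sees the keywords; the paths and any conversion stay inside the plugin, which is why the keyword definitions are the part that scientists need to agree on.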

Standard definition of keywords : 

It is of course a very long process. But it has already been done by the imgCIF community (at least) for crystallography. The NeXus community is also working on the application definition.

Is it manageable ? : 

It is a light process: developing a plugin costs a few weeks of work, adapting an application costs a few weeks of work, and it makes it possible to deal with existing files. It is an open process: newcomers only have to implement the standardised interface.

Slide 13: 

Well, well. These are beautiful ideas, but where is the source code?

Slide 14: 

Data file access abstraction in collaboration with ANSTO (Common Data Model). Project initiated by ANSTO. The CDM API makes it possible to create generic data analysis tools without needing to know the format(s) of the experimental data files. Each facility has already developed a set of data format plugins. Current plugins: NetCDF (ANSTO), NeXus (SOLEIL, V1). The CDM API is currently available in Java. SOLEIL is already using it with its "data reduction" application, foxtrot. Diagram label: Common Data Access API.

Slide 15: 

SOLEIL/ANSTO collaboration goals and milestones. First share the Common Data Model API, and then: the graphical components (COMETE and GumTree UI) made on top of the "CommonDataModel API"; the graphical applications made on top of these graphical components; the data analysis algorithms used in the GumTree and COMETE frameworks.

Slide 16: 

Conclusion: the next steps

Looking for collaboration : 

At the CommonDataModel level: help with the C++ port; write CDM plugins for their own institute's formats. At the data analysis applications level: determine the "killer applications" for the various experimental techniques (SAXS/SANS, ..) and adapt them to the CDM API. First "CDM/COMETE" developers' meeting: June 30th at SOLEIL.

Looking for resources : 

Budget = resources = developers = source code = happy scientists. If Pandata has money to spend (in an intelligent way), we think the CDM could be an interesting way of making data file transfer between institutes a reality in the near future, for a cost that is limited and distributed among institutes.
