First ASTERICS-OBELICS Workshop, 12-14 December 2016

Europe/Rome
Sala Convegni, Casa I CAPPUCCINI, Via Vittorio Veneto, 21, 00187 Roma, Italy
Description
The 1st ASTERICS – OBELICS Workshop on Data Management in Astronomy and Astroparticle Physics will take place on 12-14 December 2016 at Casa I CAPPUCCINI, Via Vittorio Veneto 21, Rome, Italy.

This workshop is organized in the framework of the OBELICS (Observatory E-environments LINked by common ChallengeS) work package of ASTERICS. OBELICS activities aim at encouraging common developments and the adoption of common solutions for data processing, archiving, analysis and access among ESFRI and other world-class projects in Astronomy and Astroparticle Physics, such as CTA, SKA, KM3NeT, EUCLID, LSST, EGO-Virgo and E-ELT.


This 1st ASTERICS – OBELICS Workshop addresses challenges in “Science Data Cloud & Computing Models” in Astronomy and Astroparticle Physics, through the engagement of users and e-infrastructures.

Participants
  • Alessandro Costa
  • Andre Schaaff
  • Andrea Bignamini
  • Andrea Ceccanti
  • Andreas Haupt
  • Andrew Lahiff
  • Andy Lawrence
  • Antonio Masiero
  • Arpad Szomoru
  • Bart Scheers
  • Benoit DELAUNAY
  • Bob Jones
  • Bob Mann
  • Christian Neissner
  • Ciro Bigongiari
  • Cristiano Bozza
  • Cristiano Palomba
  • Cristina Knapic
  • Daniele Gregori
  • Davide Salomoni
  • Denis Bastieri
  • Des Small
  • Doina Cristina Duma
  • Domenico Giordano
  • Dominique Boutigny
  • Eric Chassande-Mottin
  • Eva Sciacca
  • Fabio Hernandez
  • Fabio Pasian
  • Fabrice Jammes
  • Fabrizio Lucarelli
  • Filippo M. Zerbi
  • Francesco Bedosti
  • Francesco Visconti
  • Fulvio Gianotti
  • Gaetano Maron
  • Gergely Sipos
  • Giorgio Urso
  • Giovanna Jerse
  • Giovanni LAMANNA
  • Hanno Holties
  • Harro Verkouter
  • Jayesh Wagh
  • Kay Graf
  • L. Angelo Antonelli
  • Liam Quinn
  • Licia Florio
  • Luca Valenziano
  • Luciano Fontana
  • Luciano Gaido
  • Luisa Arrabito
  • Manuel Delfino
  • Marco de Vos
  • Marcos Lopez
  • Marie Anne Bizouard
  • Mario David
  • Mark Kettenis
  • Markus Demleitner
  • Matteo Perri
  • Mattieu Puel
  • Maurice Poncet
  • Michael Sterzik
  • Michael Wise
  • Michele Mastropietro
  • Michele Punturo
  • Nadine NEYROUD
  • Nicholas Rees
  • Nicolas CHOTARD
  • Oliver Keeble
  • Paschal Coyle
  • Paul Alexander
  • Peter Couvares
  • Peter Wegner
  • Philippe DEVERCHERE
  • Piero Altoe
  • Pierre-Etienne Macchi
  • Rachid Lemrani
  • Raphael Ritz
  • Rob van der Meer
  • Sara Bertocco
  • Sauvage Marc
  • Saverio Lombardi
  • Stefano Gallozzi
  • Thomas Vuillaume
  • Tiziana Ferrari
  • Vladimir Kulikovskiy
  • Volker Guelzow
    Monday, 12 December 2016
    • 11:30 12:15
      Workshop Registration Sala Convegni

    • 12:15 12:25
      Welcome address 10m Sala Convegni

      Speaker: Dr Lucio Angelo Antonelli (INAF Osservatorio Astronomico di Roma)
      Slides
    • 12:25 12:50
      OBELICS & Objectives of the workshop 25m Sala Convegni

      Speaker: Dr Giovanni Lamanna (LAPP/IN2P3/CNRS)
      Slides
    • 12:50 13:50
      Networking Lunch (Lunch Room, Hotel Imperiale)

    • 13:50 14:00
      H2020- Astronomy ESFRI and Research Infrastructure Cluster (ASTERICS) 10m Sala Convegni

      Speaker: Dr Marco de Vos (ASTRON)
    • 14:00 14:20
      ESFRI Project Presentation - CTA 20m Sala Convegni

      The Cherenkov Telescope Array (CTA) ESFRI project will be briefly introduced in terms of its scientific aims, instrumental configuration and organization. The main scope of this talk is to walk through the full CTA observatory data flow and present the main challenges in data acquisition, data volume generation, transfer, processing and archiving. A focus will be put on the data dissemination and data access policies and their implications for the CTA computing model concept. Challenges and impacts related to the Authentication and Authorization system will be reported. Prototype evaluations and the implementation of the Monte Carlo simulation production model will also be presented.
      Speaker: Dr Nadine Neyroud (CTA-LAPP, Technical Director)
      Slides
    • 14:25 14:55
      Tutorial: A&A in CTA: from User Requirements towards a Research Infrastructure 30m Sala Convegni

      The CTA Authentication and Authorization Infrastructure (AAI) aims at providing access to the CTA scientific gateway, applications, services and resources based on each user's profile and category, according to roles and/or access rights. In this talk, we will present the AAI user requirements and the INAF AAI infrastructure. Finally, the INAF AAI will be demonstrated, showing the main A&A features and the connection with the INAF Scientific Gateway.
      Speakers: Dr Alessandro COSTA (Senior Technologist and Computer Scientist at INAF), Dr Eva SCIACCA (Researcher, INAF)
    • 15:00 15:20
      H2020 AARC2: Community Driven Developments in the Identity Space 20m Sala Convegni

      This talk provides an overview of the developments in the identity space that characterise the research and education community. Federated access has established itself as a secure and user-friendly approach to access management, and via eduGAIN this approach scales to the global level and meets the needs of a wide range of services. Increasingly, fine-grained access management is also becoming a requirement for international research collaborations, some of which have specific requirements that go beyond present-day capabilities and pose additional challenges to eduGAIN and, more generally, to federated access. To address these advanced cases, eduGAIN offers a solid foundation on which to build advanced, tailored technical solutions for research, and the AARC architecture provides an approach to integrate them into a wider ecosystem of infrastructures and services for R&E. This talk focuses on the work done within the AARC project to follow a community-driven approach and to propose best practices, architectural patterns and interfaces that enable research and e-infrastructures to build interoperable Authentication and Authorisation Infrastructures (AAIs). The talk will also briefly report on the plan to continue the AARC work in AARC2. See also: https://aarc-project.eu
      Speaker: Dr Licia Florio (AARC Project Coordinator- GEANT)
      Slides
    • 15:25 15:55
      Tutorial: How INDIGO-Datacloud brokers identities and does authentication and authorization 30m Sala Convegni

      Contemporary distributed computing infrastructures (DCIs) are not easily and securely accessible by common users. Computing environments are typically hard to integrate due to interoperability problems resulting from the use of different authentication mechanisms, identity negotiation protocols and access control policies. Such limitations have a big impact on the user experience, making it hard for user communities to port and run their scientific applications on resources aggregated from multiple providers in different organisational and national domains. INDIGO-DataCloud will provide the services and tools needed to enable a secure composition of resources from multiple providers in support of scientific applications. In order to do so, an AAI architecture has to be defined that satisfies the following requirements:
      - is not bound to a single authentication mechanism, and can leverage federated authentication mechanisms;
      - provides a layer where identities coming from different sources can be managed in a uniform way;
      - defines how attributes linked to these identities are represented and understood by services;
      - defines how controlled delegation of privileges across a chain of services can be implemented;
      - defines how consistent authorization across heterogeneous services can be achieved, and provides the tools to define, propagate, compose and enforce authorization policies;
      - is mainly targeted at HTTP services, but can also accommodate non-HTTP services, leveraging token translation.
      In this contribution, Dr Ceccanti will present the work done in the first year of the INDIGO project to address the above challenges. In particular, he will introduce the INDIGO AAI architecture, its main components and their status, and demonstrate how authentication, delegation and authorisation flows are implemented across services (a minimal illustrative sketch of the token-based pattern follows this entry).
      Speaker: Dr Andrea CECCANTI (INFN-CNAF)
      Slides
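
      As a concrete illustration of the token-based pattern described in the abstract above, the following Python sketch shows a generic OAuth2 client-credentials flow: obtain an access token from an authorization server, then present it as a bearer token to a protected service. The endpoint URLs and client credentials are placeholders invented for this example, not documented INDIGO-DataCloud endpoints.

      # Minimal sketch of a token-based AAI interaction (generic OAuth2 pattern).
      # The URLs, client_id and client_secret below are placeholders for
      # illustration only; they are NOT documented INDIGO-DataCloud endpoints.
      import requests

      TOKEN_ENDPOINT = "https://iam.example.org/token"      # hypothetical authorization server
      SERVICE_URL = "https://service.example.org/api/data"  # hypothetical protected service

      def get_access_token(client_id: str, client_secret: str) -> str:
          """Obtain an OAuth2 access token via the client-credentials grant."""
          resp = requests.post(
              TOKEN_ENDPOINT,
              data={"grant_type": "client_credentials"},
              auth=(client_id, client_secret),
              timeout=30,
          )
          resp.raise_for_status()
          return resp.json()["access_token"]

      def call_protected_service(token: str) -> dict:
          """Call a protected resource, presenting the bearer token for authorization."""
          resp = requests.get(
              SERVICE_URL,
              headers={"Authorization": f"Bearer {token}"},
              timeout=30,
          )
          resp.raise_for_status()
          return resp.json()

      if __name__ == "__main__":
          token = get_access_token("my-client-id", "my-client-secret")
          print(call_protected_service(token))
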
    • 16:00 16:20
      Tea-Coffee Break 20m Sala Convegni

    • 16:20 16:40
      LOFAR Presentation 20m Sala Convegni

      The LOFAR telescope has been in operation since 2010 and has since produced over twenty-five petabytes of data, which are stored in its long-term archive. LOFAR supports a worldwide community of researchers and is built and operated by an international collaboration involving contributions from many science institutes and four research data centers distributed across Europe. LOFAR is the first astronomical instrument to operate an archive at this scale, and the experience being built up is input for the design of an SKA Science Data Center. This presentation focuses on user interaction with LOFAR and its archive, in particular on the authentication & authorization mechanisms that are in place and the lessons learnt from user and operational experience over the past years.
      Speaker: Dr Hanno Holties (System Engineer, ASTRON)
      Slides
    • 16:45 17:15
      Tutorial: EGI A&A Demo 30m Sala Convegni

      The EGI CheckIn Service enables research communities to access EGI and community services through a central security proxy service. Researchers from home organizations that participate in one of the eduGAIN federations are able to access services using the same credentials they use at their home organization. Furthermore, the EGI AAI CheckIn Service supports user authentication with social media identities, enabling even those users who do not have a federated account at a home organization (such as many users belonging to the “long tail of science”). The EGI AAI CheckIn Service can connect to existing community-based AAIs, and it can be offered as “Identity Access Management as a Service” to those communities that do not have, or do not want to operate, their own AAIs.
      Speaker: Dr Mario DAVID (LIP Lisbon, Portugal (associate researcher))
      Slides
    • 17:20 18:20
      A&A Panel Discussion : How to harmonise A&A mechanisms across the world-class projects in astrophysics and astroparticles? 1h Sala Convegni

      Speaker: Dr Fabio Pasian (Head, Astrophysical Technologies Group INAF)
    Tuesday, 13 December 2016
    • 09:00 09:20
      ESFRI Project Presentation - SKA 20m Sala Convegni

      The Square Kilometre Array (SKA) has a very demanding data management, storage and processing challenge. In this talk I will concentrate on the latter stages of the analysis pipeline, which will be managed by the Science Data Processor (SKA-SDP) element of the SKA (part of the observatory infrastructure) and the SKA regional centres. The tiered model adopted by the SKA is similar to that used by CERN, but has additional challenges, not the least of which is the data volume. The SDP element ingests data at up to 1.5 TBytes/s and, averaged over a period of days, must process these data into science-ready data products. The first SKA-SDP challenge is that some analysis must be performed as quickly as possible under strict latency requirements, with further iterative, computationally expensive processing requiring a net aggregate I/O bandwidth of order 10 TBytes/s. Data management for the SKA-SDP is a major challenge, and in this talk I will discuss the architecture that the SKA-SDP is currently considering, which includes a data-driven execution framework to help optimise data placement and movement between memory and several layers of persistent storage. The SKA-SDP processing stage will produce about 1 PByte of data products per day. These will then be distributed to SKA Regional Centres, where further processing to produce secondary data products and other science extraction on the data will occur. At this stage interaction with science products from other observatories is essential. I will briefly discuss some of the workflows required, likely requirements for transfer to and preservation at the regional centres, and how this will interact with the observatory.
      Speaker: Dr Paul Alexander (Head of Astrophysics, Cavendish Laboratory, University of Cambridge)
      Slides
    • 09:25 09:45
      Oracle Long-Term Storage Solutions 20m Sala Convegni

      Volumes of scientific data to be preserved for the long term keep growing at an unprecedented rate while placing ever greater demands on availability, durability, cost, access and management. Oracle addresses those challenges by providing differentiated technologies and solutions that go from enterprise-grade tape to massively scalable storage cloud services, enabling long term preservation and distribution/sharing of scientific data through a variety of hybrid architectures.
      Speaker: Mr Philippe DEVERCHERE (Oracle EMEA Storage CTO)
      Slides
    • 09:50 10:10
      OpenStack Foundation - Private and public Cloud experiences 20m Sala Convegni

      Since 2013, CERN has been running OpenStack for its private cloud infrastructure, which exposes CERN compute resources to 13,000 physicists around the world in support of fundamental research in High Energy Physics. With more than 7,000 servers and 190,000 cores spread over two data centres, this is one of the world's largest OpenStack private clouds. This talk will describe the history, the architecture, the tools and the technical decisions behind the CERN OpenStack Cloud Infrastructure (a minimal, generic provisioning sketch follows this entry).
      Speaker: Dr Domenico GIORDANO (Computing Engineer, CERN IT Department)
      Slides
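
      To make concrete what exposing compute resources through OpenStack looks like for an end user, the following Python sketch provisions a virtual machine with the openstacksdk client. The cloud name, image, flavor and network names are placeholders and this is a generic example, not CERN's actual configuration; credentials are assumed to be available via a clouds.yaml file or environment variables.

      # Minimal sketch of provisioning a VM on an OpenStack cloud via openstacksdk.
      # Cloud, image, flavor and network names are illustrative placeholders only.
      import openstack

      # Credentials/region are resolved from clouds.yaml or OS_* environment variables.
      conn = openstack.connect(cloud="my-cloud")

      image = conn.compute.find_image("cirros-0.5.2")      # placeholder image name
      flavor = conn.compute.find_flavor("m1.small")        # placeholder flavor name
      network = conn.network.find_network("private")       # placeholder network name

      server = conn.compute.create_server(
          name="obelics-demo-vm",
          image_id=image.id,
          flavor_id=flavor.id,
          networks=[{"uuid": network.id}],
      )
      # Block until the VM reaches ACTIVE state, then report it.
      server = conn.compute.wait_for_server(server)
      print(server.name, server.status)
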
    • 10:15 10:35
      Tea-Coffee Break 20m Sala Convegni

    • 10:35 10:55
      Data Management at ESO 20m Sala Convegni

      The talk will explain the strategies and challenges in scientific data management for the European Southern Observatory.
      Speaker: Dr Michael F. STERZIK (Head of Data Management Operations, ESO)
      Slides
    • 11:00 11:20
      The Research Data Alliance (RDA) on practical policies 20m Sala Convegni

      Computer-actionable policies can be used to enforce management policies, automate administrative tasks, validate assessment criteria, and automate scientific analyses. The RDA Practical Policies working group has put together templates of common policies, as well as implementation examples of the policies in different systems, illustrating best practices.
      Speaker: Dr Raphael RITZ (Max Planck Computing and Data Facility)
      Slides
    • 11:25 11:45
      A globally distributed data management solution 20m Sala Convegni

      The distribution, storage and analysis of the data from the LHC at CERN is a global affair. This talk describes the computing platform known as "WLCG" (the Worldwide LHC Computing Grid), focusing on data management and discussing the potential for reuse of its resources, components and concepts.
      Speaker: Dr Oliver KEEBLE (Team Leader, Storage Group, CERN.)
      Slides
    • 11:50 13:00
      Panel Discussion 1h 10m Sala Convegni

      Speaker: Dr Michael WISE (Head of the Astronomy Group at ASTRON)
    • 13:00 13:10
      Group Photo (Piazza Barberini)
    • 13:10 14:00
      Lunch break (Lunch Room, Hotel Imperiale)

    • 14:00 14:30
      Visit: MUSEO E CRIPTA DEI FRATI CAPPUCCINI (Casa I CAPPUCCINI)

      More info on the museum: http://www.cappucciniviaveneto.it/il_museo_3.html

    • 14:30 14:50
      LSST Presentation 20m Sala Convegni

      Speaker: Dr Dominique Boutigny (LSST-France Principal Investigator, LAPP)
      Slides
    • 14:55 15:25
      Tutorial: MonetDB in the context of the new (high-cadence) facilities 30m Sala Convegni

      Optical and radio telescopes planned for the near future will generate enormous data streams to meet their scientific goals, e.g. high-speed all-sky surveys, searches for rapid transient and variable sources, and cataloguing many millions of sources with their thousands of measurements each. These high-cadence instruments challenge many aspects of contemporary data management systems. However, no database system yet exists that keeps pace with and stores these huge amounts of scientific data, nor one capable of querying the data scientifically with acceptable response times. The open-source relational database management system MonetDB is built upon column-store technology, which has many advantages in different Big Data science domains. MonetDB is a mature main-memory database system, compliant with the SQL:2003 standard, and has APIs to C, Java, Python and R. The ease of extending its functionality with UDFs written in SQL, C, R and, recently, Python is another strong point. Furthermore, support for SQL management of external data makes loading of binary data, e.g. FITS files, extremely fast. With the experience and lessons learnt from high-cadence radio astronomy (LOFAR), further development was triggered to meet the database needs that characterise the optical regime, where source densities are orders of magnitude larger. MonetDB is a key component in the automated full-source pipeline for the optical BlackGEM telescopes, currently under construction. In this tutorial talk I will give an overview of the properties of column stores, the fundamental differences between MonetDB and the well-known mainstream row stores, and how it is being used in astronomical pipelines and archives. I will discuss an embedded implementation of MonetDB in the experimental infrastructure of the SciLens platform, a tiered 300+ node locally distributed cluster focused on massive I/O rather than raw computing power, where remote and merge tables play a crucial role. In the context of BlackGEM, I will show examples and promising results for source cross-matching using alternative multi-dimensional tree indexes built inside the database engine. (A minimal query sketch using the Python API follows this entry.)
      Speaker: Dr Bart SCHEERS (Postdoctoral researcher in the Database Architecture, CWI Amsterdam)
      Slides
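
      As a minimal illustration of the Python API mentioned in the abstract above, the following sketch connects to a MonetDB server with the pymonetdb DB-API client and runs a simple box selection. The database, table and column names (demo, sources, ra, decl, flux) are invented placeholders, not part of any specific survey schema, and a running MonetDB server with default credentials is assumed.

      # Minimal sketch of querying MonetDB from Python via the pymonetdb DB-API client.
      # Connection parameters and the schema are illustrative placeholders only.
      import pymonetdb

      conn = pymonetdb.connect(
          username="monetdb", password="monetdb",
          hostname="localhost", database="demo",
      )
      cur = conn.cursor()

      # A simple box selection on a hypothetical source catalogue; in a column store
      # only the referenced columns (ra, decl, flux) need to be scanned.
      cur.execute("""
          SELECT ra, decl, flux
          FROM sources
          WHERE ra BETWEEN 149.5 AND 150.5
            AND decl BETWEEN 1.5 AND 2.5
          ORDER BY flux DESC
          LIMIT 10
      """)
      for ra, decl, flux in cur.fetchall():
          print(ra, decl, flux)

      cur.close()
      conn.close()
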
    • 15:30 16:00
      Tutorial: QServ Presentation & Demo 30m Sala Convegni

      The Large Synoptic Survey Telescope (LSST) will revolutionize astronomy. Equipped with the largest camera sensor ever designed for astronomy, the telescope will allow detailed observation of the universe on a greater scale than achieved to date. The instrument will conduct research ranging from asteroid identification to understanding the nature of dark matter and dark energy. Operating from 2022 onwards, the processing of the data produced by LSST requires the computational power of tens of thousands of processors and several petabytes of data storage capacity per year. The program will run for at least a decade. Celestial objects and their physical properties are identified and catalogued in a database which will eventually include trillions of entries. With a volume in the order of several tens of petabytes, this catalog will play a major role in the scientific exploitation of the data produced by the telescope. To meet these needs, a dedicated software system called Qserv is being developed by a team of engineers, the majority based at Stanford University in the United States. This talk presents the Qserv architecture, the challenges it must meet, its progress, and the results of tests carried out during several recent yearly campaigns. The authors are part of the Qserv development team operating the testbed infrastructure, which currently consists of 400 processors and 500 terabytes of storage and is located at the IN2P3/CNRS computing centre.
      Speaker: Fabrice Jammes (LPC-IN2P3, Clermont-Ferrand)
    • 16:05 16:25
      Tea-Coffee Break 20m Sala Convegni

    • 16:25 16:55
      SPARK Demo & Presentation 30m Sala Convegni

      To face the increasing volume of data we will have to manage in the coming years, we are testing and prototyping implementations in the Big Data domain (both data and processing). The CDS provides an "X-Match" service which performs a cross-correlation of sources between very large catalogues (over a billion rows). It is a fuzzy join between two tables of several hundred million lines (e.g. 470,992,970 sources for 2MASS). A user can cross-match the (over 10,000) catalogues proposed by the CDS, or upload their own table (with positions) to cross-match it against these catalogues. It is based on optimized developments implemented on a well-sized server. The area concerned by the cross-match can be the full sky (which involves all the sources), a cone (only the sources within a given angular distance of a given position), or a HEALPix cell. This kind of treatment is potentially "heavy" and requires appropriate techniques (data structures and computing algorithms) to ensure good performance and to enable its use in online services. Apache Spark seemed very promising and we decided to improve the algorithms by using this technology in a suitable technical environment and by testing it with large datasets. Compared to Hadoop, Spark is designed to work as much as possible in memory. We performed comparative tests with our X-Match service and reached an execution time better than that of the X-Match service. We will detail this experiment step by step and show the corresponding metrics. We will focus on the bottleneck we encountered during the shuffle phase of Spark, and especially on the difficulty of achieving "data co-location", which is a way to decrease the data exchange between the nodes. An illustration of how Spark works will be given through a quick demo. (A minimal PySpark co-location sketch follows this entry.)
      Speaker: Andre Schaaff (Strasbourg astronomical Data Center (CDS))
      Slides
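
      The following PySpark sketch illustrates the "data co-location" idea discussed above: both catalogues are keyed by a HEALPix pixel so that candidate pairs land in the same partition, then pairs are filtered by angular separation. The column names, NSIDE value, match radius and toy data are illustrative assumptions, and the sketch deliberately ignores pairs split across pixel borders (a production cross-match such as the CDS X-Match also checks neighbouring pixels). It requires pyspark and healpy to be installed.

      # Minimal PySpark sketch of HEALPix-keyed "co-location" for a catalogue cross-match.
      # All names, NSIDE and the match radius are illustrative choices, and pairs that
      # straddle a pixel border are ignored here (a real service also checks neighbours).
      import healpy as hp
      from pyspark.sql import SparkSession, functions as F, types as T

      NSIDE = 256           # coarse HEALPix resolution used as the join/partition key
      RADIUS_ARCSEC = 1.0   # cross-match radius

      spark = SparkSession.builder.appName("xmatch-sketch").getOrCreate()

      @F.udf(T.LongType())
      def healpix_id(ra, dec):
          """Map (ra, dec) in degrees to a HEALPix pixel index (NESTED scheme)."""
          return int(hp.ang2pix(NSIDE, ra, dec, nest=True, lonlat=True))

      def with_pixel(df, ra_col, dec_col):
          return df.withColumn("hpx", healpix_id(F.col(ra_col), F.col(dec_col)))

      # Two toy catalogues; in practice these would be read from Parquet/CSV dumps.
      cat_a = spark.createDataFrame([(1, 150.0, 2.0), (2, 10.0, -30.0)],
                                    ["id_a", "ra_a", "dec_a"])
      cat_b = spark.createDataFrame([(7, 150.0001, 2.0001), (8, 200.0, 45.0)],
                                    ["id_b", "ra_b", "dec_b"])

      a = with_pixel(cat_a, "ra_a", "dec_a").repartition("hpx")
      b = with_pixel(cat_b, "ra_b", "dec_b").repartition("hpx")

      pairs = a.join(b, on="hpx")  # candidate pairs share a pixel, hence a partition

      # Haversine angular separation in arcseconds, using built-in column functions.
      d_ra = F.radians(pairs.ra_b) - F.radians(pairs.ra_a)
      d_dec = F.radians(pairs.dec_b) - F.radians(pairs.dec_a)
      h = (F.pow(F.sin(d_dec / 2), 2)
           + F.cos(F.radians(pairs.dec_a)) * F.cos(F.radians(pairs.dec_b))
           * F.pow(F.sin(d_ra / 2), 2))
      sep_arcsec = F.degrees(2 * F.asin(F.sqrt(h))) * 3600

      matches = (pairs.withColumn("sep_arcsec", sep_arcsec)
                      .filter(F.col("sep_arcsec") <= RADIUS_ARCSEC))
      matches.select("id_a", "id_b", "sep_arcsec").show()

      Keying both sides on the same pixel column before the join is what lets Spark co-locate candidate pairs and keep the shuffle traffic bounded, which is the behaviour the abstract identifies as the main bottleneck.
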
    • 17:00 18:00
      Panel Discussion 1h Sala Convegni

      Speaker: Dr Volker GUELZOW (Head of IT, DESY)
    • 20:00 23:00
      Social Dinner (Restaurant TBD)

    Wednesday, 14 December 2016
    • 09:00 09:20
      EUCLID Presentation 20m Sala Convegni

      The Euclid Mission is the second medium-class mission of the European Space Agency's Cosmic Vision program. Scheduled for launch in late 2020, it will survey the sky between the Galactic and Ecliptic planes in order to map the time evolution of the Dark Matter distribution and to elucidate the nature of Dark Energy. To do so, it carries two instruments performing three simultaneous surveys covering 15,000 square degrees of sky. VIS is a half-gigapixel visible camera that will produce images of the extragalactic sky with a spatial resolution close to that of the Hubble Space Telescope. NISP is both an imager and a slitless spectrograph, equipped with 16 near-infrared detectors of 2k by 2k pixels each. Together they will measure shapes and colors of more than a billion galaxies and produce spectra for 50 million objects that will be used for the main cosmological goals of Euclid. Legacy science also features highly in Euclid's program, as we anticipate that its final catalog will contain of the order of 10 billion objects. Furthermore, the cosmological goals of Euclid require that the space data be complemented by ground-based data, mostly of photometric nature, covering the same area as the Euclid survey. Contrary to the space-based data, the ground-based data will be heterogeneous by nature, as they must be acquired by a variety of telescopes. The acquisition, processing and scientific exploitation of the Euclid data are therefore a challenge both at the infrastructure level (the resources needed to gather, store and process the data) and at the organizational level (orchestration and tracking of the complex data processing). In this presentation, I will discuss how the Euclid Consortium, and more precisely the Science Ground Segment, is building the system that will let the mission face the complexity of data handling, processing, and preservation, to ensure it can be efficiently exploited by scientists.
      Speaker: Dr Marc Sauvage (Euclid Science Ground Segment Scientist, Head of the star formation and interstellar medium group (LFEMI) at SAp.)
      Slides
    • 09:25 09:45
      H2020 Indigo DataCloud presentation 20m Sala Convegni

      Speaker: Dr Davide SALOMONI (Technology Director INFN, INDIGO-DataCloud coordinator)
      Slides
    • 09:50 10:20
      Tutorial: DIRAC Systems presentation 30m Sala Convegni

      DIRAC (Distributed Infrastructure with Remote Agent Control) is a general framework for the management of tasks over distributed heterogeneous computing infrastructures. It was originally developed to support the production activities of the LHCb (Large Hadron Collider beauty) experiment and is today extensively used by several particle physics and biology communities. The main DIRAC components are the Workload and Data Management Systems, together with a workflow engine called the ‘Transformation System’. In this talk, we will present the main functionalities of DIRAC for workload and workflow management. Finally, we will give a demonstration of how DIRAC is used for the Monte Carlo production and analysis of CTA (the Cherenkov Telescope Array). (A minimal job-submission sketch using the DIRAC Python API follows this entry.)
      Speaker: Dr Luisa ARRABITO (Computer Engineer at LUPM IN2P3/CNRS)
      Slides
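
      As a small illustration of the Workload Management System mentioned above, the following sketch submits a trivial job through the DIRAC Python API. It assumes an installed and configured DIRAC client and a valid grid proxy (e.g. obtained with dirac-proxy-init); the job name and executable are placeholders.

      # Minimal sketch of submitting a job through the DIRAC Python API.
      # Requires a configured DIRAC client plus a valid proxy; names are placeholders.
      from DIRAC.Core.Base import Script
      Script.parseCommandLine(ignoreErrors=True)   # initialise the DIRAC environment

      from DIRAC.Interfaces.API.Dirac import Dirac
      from DIRAC.Interfaces.API.Job import Job

      job = Job()
      job.setName("obelics-demo")
      job.setExecutable("/bin/echo", arguments="hello from DIRAC")
      job.setOutputSandbox(["StdOut", "StdErr"])   # retrieve stdout/stderr afterwards

      dirac = Dirac()
      result = dirac.submitJob(job)
      if result["OK"]:
          print("Submitted job with ID", result["Value"])
      else:
          print("Submission failed:", result["Message"])
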
    • 10:25 10:45
      Tea-Coffee Break 20m Sala Convegni

    • 10:45 11:05
      ESFRI Presentation - KM3NeT 20m Sala Convegni

      KM3NeT is a future European deep-sea research infrastructure hosting a new-generation neutrino telescope with a volume of several cubic kilometres that, located at the bottom of the Mediterranean Sea, will open a new window on the Universe. Within this experiment, new e-needs challenges arise due to a new technical design and more data channels compared to the successful precursor experiments ANTARES, NEMO and NESTOR. At the same time, as many e-infrastructures as possible should be re- and co-used, creating commons within this, and possibly a wider, scientific community. This presentation provides an overview of the e-needs of the KM3NeT project and the e-infrastructure commons that are in use and planned, with a special focus on workflow management and system preservation.
      Speaker: Dr Kay GRAF (Friedrich-Alexander - Universität Erlangen-Nürnberg)
      Slides
    • 11:10 11:30
      Industry Presentation - ATOS 20m Sala Convegni

      Speaker: Dr Ana Juan FERRER (Head of NG Cloud Lab, Atos – Research and Innovation)
      Slides
    • 11:35 11:55
      H2020- HNSciCloud Presentation 20m Sala Convegni

      The work of Helix Nebula [1] has shown that it is feasible to interoperate the in-house IT resources of research organisations and publicly funded e-infrastructures, such as EGI [2] and GEANT [3], with commercial cloud services. Such hybrid systems are in the interest of the users and funding agencies because they provide greater “freedom and choice” over the type of computing resources to be consumed and the manner in which they can be obtained. Propelled by the growing IT needs of the Large Hadron Collider and the experience gathered through deployments of practical use cases with Helix Nebula, CERN is leading an H2020 Pre-Commercial Procurement activity that brings together a group of 10 of Europe's leading research organisations to procure innovative IaaS-level cloud services for a range of scientific disciplines [4]. The European Commission has committed to establish an Open Science Cloud, starting in 2016, for European researchers and their global scientific collaborators, by integrating and consolidating e-infrastructure platforms, federating existing scientific clouds and research infrastructures, and supporting the development of cloud-based services [5]. HNSciCloud is foreseen to contribute to the European Open Science Cloud.
      References:
      [1] http://www.helix-nebula.eu/
      [2] http://www.egi.eu/
      [3] http://www.geant.net/
      [4] http://www.hnscicloud.eu
      [5] http://ec.europa.eu/newsroom/dae/document.cfm?doc_id=15266
      Speaker: Dr Robert JONES (Coordinator of the HNSciCloud Horizon 2020 Pre-Commercial Procurement project)
      Slides
    • 12:00 13:00
      Panel Discussion 1h Sala Convegni

      Speaker: Dr Manuel Delfino (Director of the Port d’Informació Científica (PIC))
    • 13:00 14:00
      Lunch break (Lunch Room, Hotel Imperiale)

    • 14:00 14:20
      VIRGO Presentation 20m Sala Convegni

      The first direct detection of the gravitational wave signal emitted by the coalescence of two stellar-mass black holes occurred on 14 September 2015 in the two LIGO detectors and was announced on 11 February 2016 by the LIGO Scientific Collaboration and the Virgo Collaboration. This achievement has been a crucial milestone on the long development path of both the detector technologies and the analysis methodologies. The data analysis pipelines implemented in the network of gravitational wave detectors require large and distributed computational resources to analyse a relatively small amount of data; different analysis algorithms with different requirements in terms of computing power and latency have been implemented. An overview of the LIGO-Virgo computing model is presented in this talk, and a few important details are the focus of the next one.
      Speaker: Dr Michele PUNTURO (INFN)
      Slides
    • 14:25 14:45
      LIGO-VIRGO Collaboration 20m Sala Convegni

      A large, geographically distributed scientific collaboration like LIGO-Virgo poses a number of interesting and difficult data analysis computing challenges, both technical and organizational in nature. Our scientific results depend on our ability to coordinate the activities of over 100 scientists at over 100 institutions to efficiently exploit large, complex computing resources, and our success in addressing these computing challenges has directly contributed to our recent discoveries. Our data analysis computing goals, which are often in conflict, include maximizing scientific output through computational efficiency, human efficiency, technical and methodological innovation, ease of use, flexibility, and reliability. I will discuss some of LIGO-Virgo's data analysis computing challenges, including scientific prioritization, resource allocation, optimization, development practices, distributed workflow execution, data movement, job scheduling, and accounting. We have many remaining computing challenges and much to learn from other collaborations and projects. This talk will outline some of these successes and challenges and solicit ideas for new solutions.
      Speaker: Dr Peter COUVARES (LIGO Laboratory - California Institute of Technology)
      Slides
    • 14:50 15:10
      E4 experience and expertise in HPC 20m Sala Convegni

      E4 Computer Engineering has been a leader in HPC and Grid computing for universities and research centres for more than 12 years. This presentation highlights some of the main installations and working methods used to ensure high reliability, as well as the hardware of the system solutions used to configure the clusters. In the final section we will show studies of different types of HPC architectures (ARM and x86), up to the recent design of a POWER8 cluster-based architecture achieved within the PRACE-3IP PCP European project.
      Speaker: Dr Daniele GREGORI (E4)
      Slides
    • 15:15 15:35
      NVIDIA Presentation 20m Sala Convegni

      Over the last decade, NVIDIA has designed a complete hardware and software ecosystem to support the major computing challenges of the scientific community. We recently made a step forward to improve the performance and user experience of our products thanks to the following updates: i) the new Pascal architecture (up to 5.3 TFlops), ii) the OpenACC, CUDA and CUDA Fortran programming models, iii) the free PGI compiler community edition, iv) domain-specific libraries for deep learning (cuDNN, …), v) academic education programs, and vi) the Deep Learning Institute (DLI). Details of these topics will be presented during the talk, with particular attention to the requirements of scientific projects in the field of Astronomy and Astroparticle Physics.
      Speaker: Dr Piero ALTOE (Business Development Manager HPC, NVIDIA)
      Slides
    • 15:40 16:00
      Low-Power computing with ASTRI & CTA use cases 20m Sala Convegni

      Astronomy has always been the pathfinder among the sciences: it was the first natural science to emerge worldwide, in Babylonia, Asia and America; it was the first science to enter the modern era, with the first observations by Galileo Galilei; and it is the first science to challenge human capabilities in the information age, as it is already operating in the realm of petabytes and petaFLOPS and will soon enter, with FAST and SKA, the new age of exascale computing. Given that the available electric power is limited, we are investigating the feasibility of an approach using Low-Power Computing (LPC). On the way to the peta scale, the CTA project could follow in the footsteps of ASTRI, a mini-array of Cherenkov telescopes under construction at a remote site, far away from human activities, in order to achieve optimal observation conditions for gamma-ray astronomy. In such a scenario, the capability of each telescope to process its own data before sending them to a central acquisition system provides a key advantage. We implemented the complete analysis chain required by a single telescope on a Jetson TK1 development board, exceeding the required real-time processing speed by more than a factor of two while staying within a very small power budget. The presentation of the architecture and the performance accomplished for ASTRI will be followed by a "what's next" in Low-Power Computing, where we show how traditional accelerators, like mainframe GPUs, could be supported or substituted by ARM processors and FPGAs.
      Speaker: Dr Denis Bastieri (Padova University & INAF)
      Slides
    • 16:05 16:25
      Tea-Coffee Break 20m Sala Convegni

    • 16:25 16:45
      IBM Presentation 20m Sala Convegni

      IBM's credentials in High Performance Computing (HPC) are amongst the best in the industry. Over time, IBM has delivered multiple productive, reliable systems, including SP/2, Blue Gene/L, Blue Gene/P and Blue Gene/Q: systems that have perennially ranked in the top echelon of the TOP500, GREEN500 and, most recently, GRAPH500 benchmarks. Currently, IBM's strategy is centered on OpenPOWER, coupled with accelerators (GPU, FPGA) connected by innovative interconnects (OpenCAPI, NVLink). The talk will briefly describe the hardware and software components of the IBM offering.
      Speaker: Dr Giorgio RICHELLI (IT Specialist - HPC and Software Defined Storage, IBM)
      Slides
    • 16:50 17:30
      Panel Discussion 40m Sala Convegni

      Speaker: Dr Giovanni Lamanna (LAPP/IN2P3/CNRS)
    • 17:30 17:50
      Conclusion 20m Sala Convegni

      Speaker: Dr Giovanni Lamanna (LAPP/IN2P3/CNRS)