Computing in Fundamental Particles and Forces
Computing demands in the topic Fundamental Particles and Forces arise from both experimental needs and the complexity of particle-physics phenomenology. The requirements are driven by the high granularity of the detectors, the need for detailed modelling and simulation, and by advances in computational methods in theory. The methods and algorithms developed are often of a general nature, and computing thus also acts across the different topics within the programme Matter and the Universe.
It is fortunate that, in parallel with the growing demands of physics research, IT technology has also been advancing rapidly; efficient use of advanced IT technology in this environment requires perpetual development and adaptation of concepts, solutions and software. Important concepts for the coming years are "clouds" and "social media" as collaborative tools, and "many cores", "graphics processing units" (GPUs) and "vector units" as processing units. Classical topics such as "memory usage" remain highly relevant. "Big data" remains central to physics and is gaining prominence on a larger scale as the concept spreads to other research fields. A fairly new topic for the coming years is the intelligent management of networks as an enabling resource rather than a mere information-exchange channel; bandwidth and response times matter particularly.
Some of these concepts are discussed in more detail in the following. Descriptions of the activities explicitly related to the LHC and the ILC are given in the respective sections. The performance category 2 (LK-2) computing infrastructures GridKa and DESY Tier-2, which predominantly serve the experiments of high energy physics, are described in a separate volume. A large-investment proposal for Tier-1 and Tier-2 computing upgrades is being prepared; a draft is attached to this proposal in electronic form.
Since the topics mentioned above are relevant beyond particle physics, a cross-programme topic "Data management for large-scale infrastructures" within the Research Field Matter was set up to facilitate the sharing of knowledge and resources, see also here.
Computing for LHC and beyond
The next LHC running period will lead to substantially increased data sets and consequently requires larger data-processing and Monte Carlo production capabilities. Given this situation, the current computing model employed by the experiments needs adaptation to the new running conditions in order to avoid severe impact on data processing and prohibitively increased computing and storage costs. Since the advent of dual-core CPUs, the number of cores has continued to increase — the latest processor models include up to 12 cores per physical CPU unit. The benefits of such multi-core architectures have so far not been exploited by the experiments; each core has been employed as a separate entity. However, if true parallelism at the CPU level can be activated, for example during specific parts of the event reconstruction, significant performance gains are expected. Several multi-threaded event-processing frameworks exist, such as Gaudi Hive and Whiteboard, and their suitability for experiment software should be explored. In a similar vein, auto-vectorisation of existing algorithms may also have a significant performance impact.
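The shift from one-core-one-job to event-level parallelism within a single job can be illustrated with a minimal sketch, assuming a purely hypothetical event structure and "track fit" (this is plain Python, not an actual experiment framework such as Gaudi Hive):

```python
from concurrent.futures import ProcessPoolExecutor

def reconstruct(event):
    # Hypothetical per-event reconstruction step: a toy "track fit"
    # that averages the hit positions of one event.
    hits = event["hits"]
    return {"id": event["id"], "track": sum(hits) / len(hits)}

def reconstruct_all(events, workers=4):
    # Events are independent, so instead of treating each core as a
    # separate job slot, one job can farm events out to several cores.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(reconstruct, events))

if __name__ == "__main__":
    events = [{"id": i, "hits": [i, i + 1.0, i + 2.0]} for i in range(8)]
    results = reconstruct_all(events)
```

In a real framework the parallelism is finer-grained (concurrent algorithms within one event, shared data stores, thread scheduling), but the memory saving of running many events in one multi-core process instead of many single-core processes is the same in spirit.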
The present computing models, as implemented for the first LHC running period, were developed more than 15 years ago. While the assumptions about the technology development of CPUs and storage have matched the actual development, the assumptions about network speed and reliability were far too conservative. The paradigm of a strictly hierarchical tier model, with jobs executed local to their required data, therefore no longer holds. Revised computing models, to be implemented over the next couple of years, will provide a much more dynamic and flexible system, leading to drastic changes in job-scheduling and data-management strategies.
The server market of the last decade has been dominated by the x86 architecture, and high energy physics was no exception. In addition to recent activities evaluating co-processing units such as GPUs to boost the execution of applications, the experiments will be faced with the introduction of different processor architectures such as ARM. These are emerging from the rapidly growing mobile-device sector and are produced in huge quantities in a cost- and power-effective way. The experiments will need to port their software to these new architectures.
The variety of platforms, special hardware and massively growing cloud resources will create a distributed infrastructure of inhomogeneous nature. This variety can be hidden from the end user by appropriate "middleware", which is not yet available in the required form. Experiments will need to adapt and commission such systems in a joint effort with middleware developers.
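The role of such middleware can be pictured as a uniform submission interface over dissimilar back ends. The class names and the trivial broker below are illustrative assumptions, not an existing middleware API:

```python
class Backend:
    # Common interface that hides whether a resource is a grid site,
    # a commercial cloud, or a machine with special hardware.
    def submit(self, job):
        raise NotImplementedError

class GridSite(Backend):
    def submit(self, job):
        return f"grid:{job}"

class CloudProvider(Backend):
    def submit(self, job):
        return f"cloud:{job}"

def broker(job, backends):
    # A real broker would weigh data locality, cost and availability;
    # this sketch simply uses the first available back end.
    return backends[0].submit(job)
```

The point of the abstraction is that user code calls only `broker` and `submit`, so new resource types can be commissioned without touching the experiments' workflows.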
The suggested solutions may require a more aggressive migration and adaptation timescale from the experiments to keep computing affordable in the future. Such a change affects architectures, operating systems, compilers, software versions and languages. Migrations of this kind are a natural part of any experiment's evolution, but their pace may need to be accelerated.
Computing for theoretical particle physics
Grid architectures — as employed by the LHC experiments — cannot provide adequate support for all relevant physics problems. "Large-scale simulations" in lattice gauge theories, for example, require extensive parallelisation of the code — a typical case for high-performance computing (HPC). The expensive gluon configurations are shared world-wide through the International Lattice Data Grid (ILDG). At the same time, the computing models and codes need to be optimised and adapted to next-generation hardware and algorithms, as is done in the Simulation Lab together with the Supercomputer Centre Jülich and the Cyprus Institute. DESY therefore also supports high-performance computing efforts and uses the available resources at Jülich and other centres.
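The parallelisation pattern behind such lattice codes can be illustrated with a toy one-dimensional domain decomposition: each "rank" owns a contiguous piece of the lattice and needs a halo of neighbouring sites before each update. A real code would exchange the halos via MPI and apply the actual gauge action; the sketch below only simulates the exchange in-process, with a stand-in nearest-neighbour stencil:

```python
def split_lattice(lattice, nranks):
    """Split a 1-D lattice into equal contiguous domains, one per rank."""
    n = len(lattice) // nranks
    return [lattice[i * n:(i + 1) * n] for i in range(nranks)]

def with_halo(domains, rank):
    """Attach the neighbouring boundary sites (periodic boundaries);
    in a real code this is the MPI halo exchange."""
    left = domains[(rank - 1) % len(domains)][-1]
    right = domains[(rank + 1) % len(domains)][0]
    return [left] + domains[rank] + [right]

def relax_step(domain_with_halo):
    """One nearest-neighbour update (a stand-in for the real stencil)."""
    d = domain_with_halo
    return [(d[i - 1] + d[i] + d[i + 1]) / 3.0 for i in range(1, len(d) - 1)]

lattice = [float(i) for i in range(8)]
domains = split_lattice(lattice, nranks=4)
updated = [relax_step(with_halo(domains, r)) for r in range(4)]
```

Because each update needs only a thin halo, communication grows with the surface of a domain while computation grows with its volume — the property that makes lattice simulations scale to very large HPC machines.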
"Big data" management
The dCache technology grew out of the need of the HERA and Tevatron experiments for fast access to their data, which at the time represented a huge inventory. The system was conceived to be flexible so as to adapt quickly to evolving technical solutions in mass storage, for both disk and tape archives. The technology is scalable and currently supports installations ranging from small-scale setups to several tens of petabytes of storage in the framework of the WLCG. The success of the dCache system builds largely on its responsiveness to user demands for storage features; a unique characteristic is the handling of tape and disk copies of data files. Transitions between media are performed transparently according to a variety of user-defined and automatic rules. With the availability of fast solid-state disks (SSDs), this capability will be extended to include media that are better suited to random access than traditional spinning disks, resulting in a significant speed-up of data analysis. Based on the data-access profile, which could be "physics analysis", "event reconstruction" or "wide-area data transfers", dCache determines the most appropriate location of data within the system and schedules the corresponding transfers between the different media.
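The placement logic can be pictured as a simple mapping from access profile to storage medium. The rules below are illustrative only and do not reflect dCache's actual configuration language:

```python
# Illustrative rules only: map a declared access profile to a storage
# medium, as a stand-in for dCache's user-defined placement rules.
PLACEMENT_RULES = {
    "physics analysis": "ssd",           # random access favours SSDs
    "event reconstruction": "disk",      # streaming reads from spinning disk
    "wide-area data transfers": "disk",  # sequential bulk reads
}

def place(profile, default="tape"):
    # Profiles without a rule fall back to the archive medium; a real
    # system would also schedule the transfer between the media.
    return PLACEMENT_RULES.get(profile, default)
```

The essential idea — hot, randomly accessed data on fast media, cold data migrated to tape, with transitions driven by rules rather than manual intervention — is what the text above describes.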
Other important areas with active dCache involvement are the support for data-transfer services and the "federation" of storage: thanks to its close contact with the CERN Data Management group, which implements the WLCG File Transfer Service (FTS), dCache provides seamless operation of world-wide WLCG data transfers, as well as continuous improvement of those services. Similar efforts are undertaken between dCache and the Globus Online service providers, especially to enable Globus Online to handle tape operations.
With the rapid progress in network technology, truly distributed data access becomes a viable alternative to data transfers and local storage. Such federated data storage needs to be implemented transparently, and various possibilities are currently being explored in a joint effort with other research institutions such as CERN, the European Grid Initiative (EGI) and the Open Science Grid (OSG) in the USA, and with industry partners. The dCache collaboration is actively participating in a working group providing a world-wide federation of LHC data using an LHC-specific protocol. Similarly, dCache is collaborating with CERN on an effort to provide world-wide data federations with the http/WebDAV protocol, allowing generic, non-HEP-specific open-source applications to be used in the HEP ecosystem.
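One practical consequence of an http/WebDAV federation is that ordinary HTTP clients can read federated data directly, including partial reads via Range headers. The sketch below uses only the Python standard library; the URL in the usage comment is of course hypothetical:

```python
import urllib.request

def range_request(url, start, length):
    # Build an HTTP request for a byte range of a remote file, as any
    # generic HTTP/WebDAV client could issue against a federated store.
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={start}-{start + length - 1}")
    return req

# Usage (hypothetical URL):
#   with urllib.request.urlopen(range_request(
#           "https://example.org/lhc/run2/events.root", 0, 4096)) as r:
#       header_bytes = r.read()
```

Partial reads matter because analysis applications often need only a small slice of a large file; with a generic protocol, no HEP-specific client software is required on the reading side.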
With experiments collecting data over periods of several decades, "data preservation" — i.e. maintaining the analysability of HEP data over long time periods, independently of technology changes in computing platforms — becomes an important matter. In the current funding period, DESY set up a prototype system which serves as a proof of principle for the aspirations of the international DPHEP collaboration. In the next funding period, the development of a software validation system at DESY will continue; it is attracting attention from the HEP community as a practical and concrete solution to the problem of data preservation in high energy physics. The validation system, alongside other projects such as archival storage methods and migration techniques, has placed DESY in a leading and highly visible role in this emerging field.
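The core of such a validation system can be sketched as follows: re-run a reference analysis on the preserved data after every platform or software change, and compare the result against a stored reference digest. This is a minimal illustration of the idea, not DESY's actual implementation:

```python
import hashlib
import json

def result_digest(result):
    # Canonical digest of an analysis result (a dict of physics outputs);
    # sort_keys makes the digest independent of dict ordering.
    payload = json.dumps(result, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def validate(analysis, data, reference_digest):
    # Re-run the analysis on the preserved data; any change in the
    # physics output (e.g. after a compiler or OS migration) is flagged.
    return result_digest(analysis(data)) == reference_digest
```

The reference digest is produced once on a trusted platform; from then on, every migration is followed by an automated `validate` run, so silent changes in the physics output are caught rather than discovered years later.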
Grid & cloud and network development
Grid and cloud technology is now widely deployed and needs to be adapted in step with the growth of data volumes and technological developments. Efficient usage of the resources requires an adequate management framework, and existing tools for managing large resources have to be improved. With the computing models of the LHC experiments becoming more federated, networking grows ever more important. The LHCone network infrastructure is a successful example of the intelligent management of network resources in terms of bandwidth and quality of service. These topics — and others such as virtualisation and opportunistic computing — remain major challenges and will be pursued at DESY, which represents Germany in the relevant international networking groups.