Official SYCL 1.2.1 feedback thread

The Khronos Group welcomes all input from the community and invites you to leave your feedback, comments, and suggestions for SYCL 1.2.1 on this thread.

December 6, 2017 – Embedded Vision Alliance Member Meeting – The Khronos™ Group today announces the ratification and public release of the finalized SYCL™ 1.2.1 specification. SYCL for OpenCL™ enables code for heterogeneous processors to be written in a “single-source” style using completely standard modern C++. The multi-vendor SYCL 1.2.1 standard is available royalty-free for industry use, and the full specification together with details about the SYCL open-sourced conformance test suite and Adopters Program can be found at www.khronos.org/sycl.
SYCL 1.2.1 is based on OpenCL 1.2, and is a major update representing two and a half years of work by Khronos members. The new specification incorporates significant experience gained from three separate implementations and feedback from developers of machine learning frameworks such as TensorFlow, which now supports SYCL alongside the original CUDA accelerator back-end.
SYCL single-source programming enables the host and kernel code for an application to be contained in the same source file, in a type-safe way and with the simplicity of a cross-platform asynchronous task graph. SYCL includes templates and generic lambda functions to enable higher-level application software to be cleanly coded with optimized acceleration of kernel code across the extensive range of shipping OpenCL 1.2 implementations. Developers program at a higher level than OpenCL C or C++, but always have access to lower-level code through seamless integration with OpenCL, C/C++ libraries, and frameworks such as OpenCV™ or OpenMP™.
While SYCL is a very generic domain-specific embedded language (DSEL) for modern C++, its unique interoperability with OpenCL also enables developers to use SYCL as a simpler way to program with existing OpenCL C/C++ or built-in kernels. SYCL can replace the Khronos cl2.hpp C++ wrapper to enable SYCL concepts such as asynchronous task graphs, and to relieve the programmer from writing cumbersome host-device transfer code. In addition, SYCL provides simplified error handling and effective compute and communication overlap between host and devices.
As well as interoperability with OpenCL, SYCL is also interoperable with OpenGL®, Vulkan®, OpenVX™, DirectX, and other vendor APIs, without memory-copy overhead. SYCL 1.2.1 can be implemented to work with a variety of existing and new C++ compilers and layers over OpenCL 1.2 implementations from diverse hardware vendors. SYCL builds on the Khronos SPIR™ 1.2 portable binary format and fully leverages the ongoing work at the Khronos OpenCL and SPIR working groups with the aim to provide long-term support for future OpenCL capabilities, including OpenCL 2.2, SPIR-V™, and Vulkan convergence.
SYCL 1.2.1 builds on the features of C++11, with additional support for C++14 and C++17, enabling ISO C++17 Parallel STL programs to be accelerated on OpenCL devices. To support this effort, Khronos is backing an open-source project to support Parallel STL on top of SYCL, running on OpenCL devices. This project is hosted on GitHub at KhronosGroup/SyclParallelSTL. So, while SYCL brings the power of single-source modern C++ to the OpenCL and SPIR world, it also prepares the convergence with other standards such as Khronos’ Vulkan, OpenVX and NNEF and ISO C++ (SG1, SG6, SG12, SG14).
The website SYCL.tech is a forum to allow for more community feedback on the direction and development of SYCL, to enable sharing of projects in development, and for updates on the progress of the standard. The SYCL ecosystem has enjoyed strong momentum this year, with multiple implementations now available, including ComputeCpp and triSYCL.

SYCL 1.2.1 Resources:

Question about cl::sycl::handler::update_host()

Is this correct usage according to the spec?

sycl_queue->submit([&](handler &cgh)
    {
        auto ba_a = ba.get_access<access::mode::read_write>(cgh);
        auto bb_b = bb.get_access<access::mode::read_write>(cgh);

        cgh.parallel_for<class kernel>(range<1>(n), ... code omitted ... });
        cgh.update_host(ba_a);
        cgh.update_host(bb_b);
    });

And does this mean that the host-side memory associated with the SYCL buffers ba and bb will be updated from device memory once the kernel has completed?

kevin

This code is not legal according to the spec. Specifically, SYCL 1.2.1 was clarified with the following wording:

“A command group scope in SYCL, as it is defined in Section 3.4.1, consists of a single kernel or explicit memory operation (handler methods such as copy, update_host, fill), together with its requirements.”

SYCL 1.2.1 does not allow more than one “action command” within a command group, primarily because of the very question you are asking: what the ordering semantics would be if a command group contained multiple commands.