Annual Meeting 2018

2018 UCX and RDMA Annual Meeting

Date: December 10-12, 2018

Location: Austin, Texas

For questions, please email info@ucfconsortium.org.

The UCF consortium held the 2018 UCX and RDMA annual meeting at the Arm facilities in Austin, Texas. The meeting covered the development, usage, and future direction of the UCX framework and RDMA.

Monday, 10 December, 2018: UCX – State of the Union

Tuesday, 11 December, 2018: UCX – API Discussions

Wednesday, 12 December, 2018: RDMA Core
8:30 AM Reception, Arm Lobby (all three days)
9:00 AM Keynote
Future of HPC, Steve Poole, Los Alamos National Laboratory
Keynote
UCX and MPI on Astra at Sandia National Laboratory
Keynote
RDMA Core Overview, Jason Gunthorpe
10:00 AM Topic
UCX Roadmap 2019 and UCX 2.0

Abstract
Things we would like to change, optimize, or clean up in the next UCP API, and backward-compatibility considerations

Speaker
Pasha/Arm and Yossi/Mellanox

Topic
UCT component architecture

Abstract
Split UCT into modules and load them dynamically, so that a missing dependency disables only the relevant transports

Speaker
Yossi/Mellanox

Topic
RDMA-CM discussion

Speaker
Yossi/Mellanox

Topic
Verbs API send/complete disaggregation API directions

Speaker
Jason Gunthorpe/Mellanox

11:00 AM Topic
Open MPI integration with UCX Updates

Speaker
Mellanox

Topic
UCP Active message API

Abstract
Discuss implementing active messages at the UCP level

Speaker
Yossi/Mellanox

Topic
Verbs, DevX and DV / how UCX will be using these features

Speaker
Jason Gunthorpe/Mellanox

12:00 PM ** Working Lunch **

Topic
Regression and testing for multiple uarch (x86/Power/ARM) and interconnects (RoCE, iWARP, TCP, etc.)

Speaker
All

** Working Lunch **

Topic
1. Multi-uarch support for various architectures
2. Internal memcpy (DPDK style)

Speaker
All

** Working Lunch **

Topic
UCX Upstream RDMA-core support status

Speaker
Yossi/Mellanox

Topic
SELinux

Speaker
Daniel Jurgens/Mellanox

1:00 PM Topic
UCX specification and man pages update

Speaker
Brad/AMD

Topic
1. Async progress for protocols
2. Internal memcpy (DPDK style)

Abstract
Progress various protocols (such as rendezvous, stream, disconnect, and RMA/AMO emulation) using a progress thread

Speaker
All

Topic
1. Verbs ODP MR improvements
2. RDMA and containers

Speaker
Parav/Mellanox

2:00 PM Topic
Support for shmem signaled put

Abstract
How to support the new OpenSHMEM primitive, put with signal

Speaker
Yossi/Mellanox

Topic
Open MPI BTL over UCT

Abstract
UCT API freeze

Speaker
Nathan/Los Alamos National Laboratory

Topic
Thread safety, fine-grained locking

Abstract
Discuss what is needed in UCP and UCT to support better concurrency than a big global lock

Speaker
Yossi/Mellanox

3:00 PM Topic
MPICH with UCX – State of the union

Speaker
Ken/Argonne National Laboratory

Topic
OpenSHMEM context to UCX worker mapping

Speaker
Manju/Mellanox

Topic
XPMEM support for tag matching

Abstract
Use 1-copy for expected eager messages using the UCT tag-offload API

Speaker
Yossi/Mellanox

4:00 PM Topic
OSSS SHMEM with UCX update

Speaker
Tony/Stony Brook University

Topic
UCX GPU Support

Abstract
1. State of the Union by AMD and NVIDIA
2. Datatypes for GPU devices

Speaker
Akshay/NVIDIA, Khaled/AMD, Brad/AMD

Topic
Stream API and close protocol

Abstract
Using the stream API as a replacement for TCP, and considerations for closing/flushing a connection

Speaker
Yossi/Mellanox

5:00 PM Topic
MVAPICH Status

Abstract
Latest developments in MVAPICH

Speaker
Ohio State University

Topic
UCX Collectives

Speaker
Khaled/AMD

Topic
UCX + Python bindings

Speaker
Akshay/NVIDIA

Topic
High availability, failover

Abstract
How to implement fabric error recovery by using multiple devices/ports

Speaker
Yossi/Mellanox

6:00 PM Open Discussion (all three days)