User Interrupts A faster way to signal

18
User Interrupts – A faster way to signal Sohil Mehta [email protected]

Transcript of User Interrupts A faster way to signal

Page 1: User Interrupts A faster way to signal

User Interrupts – A faster way to signal

Sohil [email protected]

Page 2: User Interrupts A faster way to signal

Goals

• Motivation• Use cases• Software architecture• Feedback / Opens• Next steps

Page 3: User Interrupts A faster way to signal

Motivation

• Deliver events directly to user space – bypass kernel• Overcome limitations of existing IPC and I/O events

• Synchronous, Asynchronous, Polling.• Signals, pipes, RPCs, hardware notifications, etc.

• Super fast and efficient alternative

First hardware implementation – x86• Intel processor code-named Sapphire Rapids

Page 4: User Interrupts A faster way to signal

Sources of User Interrupts

• Receiver infrastructure is common• Now - User to User IPC - SENDUIPI

• RFC patches are out

• Soon - Kernel to User• Patches in development

• Later - Other sources• e.g. devices

https://lore.kernel.org/lkml/[email protected]/

Page 5: User Interrupts A faster way to signal

Use cases

Signal Pipe Eventfd User IPI (blocked)

User IPI (running)

0.0

2.0

4.0

6.0

8.0

10.0

12.0

14.0

16.0

18.0

20.0

Relative IPC Latency (normalized to User IPI)• General purpose fast IPC

• User mode schedulers• Event dispatch for I/O stacks

• e.g. User space networking

• Abstractions• Libevent, Liburing, etc.

• Other suggestions?

*details in backup

9x performance improvement

Page 6: User Interrupts A faster way to signal

Software design

• Opt in – via system calls• Sender – notifies the event

• User application, kernel, others

• Receiver – waiting for events• User application

• Connection setup• File descriptor based• Closest equivalent is eventfd()

user

kernel

SENDUIPI

UINTR Connection management

UINTR_FD

Receiver

UINTR_FD

Sender

Share FD

1

2

3

4

User IPI connection

Page 7: User Interrupts A faster way to signal

Receiver APIs

• Register User Interrupt Handlerint uintr_register_handler(handler_func, flags);

int uintr_unregister_handler(flags);

• Create an fd representing the vector - priorityuintr_fd = uintr_create_fd(vector, flags);

• Share Connection with sender• Any existing fd exchange mechanism via inheritance or

sharing via socket domain

Page 8: User Interrupts A faster way to signal

Sender APIs

• Sender Connection Management using FDs• Receive FD via inheritance or UNIX domain sockets

uipi_handle = uintr_register_sender(uintr_fd,flags);

int uintr_unregister_sender(uintr_fd, flags);

• x86 Instructionvoid _senduipi(uipi_handle)

Page 9: User Interrupts A faster way to signal

User IPI SampleSendervoid *sender_thread(void *arg)

{

int uipi_handle;

uipi_handle = uintr_register_sender(uintr_fd, 0);

printf("Sending IPI from sender thread\n");

_senduipi(uipi_handle);

uintr_unregister_sender(uintr_fd, 0);

return NULL;

}

Compiler – GCC 11

gcc -muintr -mgeneral-regs-only uipi_sample.c -lpthread -o uipi_sample

Output

$./uipi_sample

Receiver enabled interrupts

Sending IPI from sender thread

User Interrupt!

Receivervoid __attribute__ ((interrupt)) u_handler(struct __uintr_frame

*frame, unsigned long vector) {

write(STDOUT_FILENO, “User Interrupt!\n”, 16);

uintr_received = 1;

}

int main(int argc, char *argv[]) {

int vector = 0;

pthread_t pt;

ret = uintr_register_handler(u_handler, 0);

uintr_fd = uintr_create_fd(vector, 0);

_stui();

printf("Receiver enabled interrupts\n");

pthread_create(&pt, NULL, &sender_thread, NULL)

/* Do some other work */

while(!uintr_received);

pthread_join(pt, NULL);

close(uintr_fd);

uintr_unregister_handler(0);

exit(EXIT_SUCCESS);

}

Page 10: User Interrupts A faster way to signal

Feedback / Opens

Feedback• Syscall and FD based approach• Other architectures

Opens• Interrupting blocking system calls like sleep(), read(), etc.• Common uipi_handle across threads (and processes)• Kernel page table isolation support• Many more..

Page 11: User Interrupts A faster way to signal

Next steps?

Any suggestion is appreciated!

Please review UINTR patches on LKML.https://lore.kernel.org/lkml/[email protected]/

Page 12: User Interrupts A faster way to signal

THANK YOU

Special Thanks!Ashok Raj, Dave Hansen, Jacob Pan Jun, Tony Luck.

Page 13: User Interrupts A faster way to signal

Backup

Page 14: User Interrupts A faster way to signal

HW arch key elements

• UPID - Posted Interrupt descriptor• Pending interrupt vectors• Notification state – ON, SN.• Routing information

• UITT - Interrupt Target Table• UPID pointer• Vector information

• SENDUIPI <UITT index>

User Interrupt Target Table

Receiver task A

Interrupt

vector z

UPID A

IA32_UIPD

Sender task C

IA32_UITT

0 - UITTE for Int x

1 - UITTE for Int y

2 - UITTE for Int z

Receiver task B

UPID B

IA32_UIPD

SENDUIPI 0

SENDUIPI 1

SENDUIPI 2

Interrupt

vector yInterrupt

vector x

Page 15: User Interrupts A faster way to signal

User Interrupt delivery

• Interrupt posting• Senduipi updates UPID• Generate notification interrupt

• Notification processing• Recognize interrupt• Move to UIRR

• Interrupt delivery• Push vector, invoke handler

• Return from handler - Uiret

User Interrupt Target Table

IA32_UIRR

UPID

SENDUIPI 2

IA32_UIPD

Interrupt

vector z

z

Handler

vector z

.

.

UIRET

0 - UITTE for Int x

1 - UITTE for Int y

2 - UITTE for Int z

IA32_Handler

Sender task

Receiver task

Page 16: User Interrupts A faster way to signal

x86 Instructions

• senduipi <uipi_handle> – send a user IPI to a target task based on the UITT index.

• uiret – Return from a User Interrupt handler• clui – Mask user interrupts by clearing UIF (User

Interrupt Flag).• stui – Unmask user interrupts by setting UIF.• testui – Test current value of UIF.

Page 17: User Interrupts A faster way to signal

Kernel API for User Notification

• Exchange uintr_fd with kernel agent• Post user interrupt from the kernel

int uintr_notify(int uintr_fd);

• Kernel identifies target, vector from uintr_fd

Page 18: User Interrupts A faster way to signal

IPC Performance

Signal Pipe Eventfd User IPI (blocked)

User IPI (running)

0.0

2.0

4.0

6.0

8.0

10.0

12.0

14.0

16.0

18.0

20.0

Relative IPC Latency (normalized to User IPI)

Original workload: https://github.com/goldsborough/ipc-benchUpdated workload: https://github.com/intel/uintr-ipc-benchConfig: 1M ping-pong IPC notifications with message size=1

Results have been estimated based on internal tests with:Kernel: Linux v5.14.0 + User IPI patchesSystem: Internal pre-production platform

*Performance varies by use, configuration and other factors.

9x performance improvement with User IPI (running)