ttrpc

Author	SHA1	Message	Date
Sebastiaan van Stijn	4785c70883	switch to github.com/containerd/log for logs Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2024-06-19 23:19:35 +02:00
Kevin Parsons	1b4f6f8edb	client: Fix deadlock when writing to pipe blocks Use sendLock to guard the entire stream allocation + write to wire operation, and streamLock to only guard access to the underlying stream map. This ensures the following: - We uphold the constraint that new stream IDs on the wire are always increasing, because whoever holds sendLock will be ensured to get the next stream ID and be the next to write to the wire. - Locks are always released in LIFO order. This prevents deadlocks. Taking sendLock before releasing streamLock means that if a goroutine blocks writing to the pipe, it can make another goroutine get stuck trying to take sendLock, and therefore streamLock will be kept locked as well. This can lead to the receiver goroutine no longer being able to read responses from the pipe, since it needs to take streamLock when processing a response. This ultimately leads to a complete deadlock of the client. It is reasonable for a server to block writes to the pipe if the client is not reading responses fast enough. So we can't expect writes to never block. I have repro'd the hang with a simple ttrpc client and server. The client spins up 100 goroutines that spam the server with requests constantly. After a few seconds of running I can see it hang. I have set the buffer size for the pipe to 0 to more easily repro, but it would still be possible to hit with a larger buffer size (just may take a higher volume of requests or larger payloads). I also validated that I no longer see the hang with this fix, by leaving the test client/server running for a few minutes. Obviously not 100% conclusive, but before I could get a hang within several seconds of running. Signed-off-by: Kevin Parsons <kevpar@microsoft.com>	2024-05-13 14:01:59 -07:00
Fu Wei	9c0db2b1c3	Merge pull request #152 from klihub/devel/unary-interceptor-chaining Implement support for unary interceptor chaining.	2023-09-06 09:39:17 +08:00
Krisztian Litkey	f984c9b178	client: implement UnaryClientInterceptor chaining. Add a WithChainUnaryClientInterceptor client option to allow using more that one client call interceptor which will then get chained and invoked in the order given. This should allow us to implement opentelemetry instrumentation as interceptors while allowing users to keep intercepting their client calls for other reasons at the same time. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>	2023-08-25 15:57:52 +03:00
Krisztian Litkey	8ca4110ebc	Fix comment for UserOnCloseWait. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>	2023-07-28 20:02:17 +03:00
Iceber Gu	c51165f20d	First process the pending messages in recv channel Signed-off-by: Iceber Gu <wei.cai-nat@daocloud.io>	2023-05-09 11:40:38 +08:00
Derek McGowan	471297eed9	Add recvClose channel to stream Prevent panic from closing recv channel, which may be written to after close. Use a separate channel to signal recv has closed and check that channel on read and write. Signed-off-by: Derek McGowan <derek@mcg.dev>	2023-05-08 12:26:34 -07:00
Vincent Batts	6eee73df5d	*.go: organize errors to one spot And add a little documentation. Signed-off-by: Vincent Batts <vbatts@hashbangbash.com>	2022-04-19 14:40:16 -04:00
Derek McGowan	80efa545d4	Unwrap syscall error and check Signed-off-by: Derek McGowan <derek@mcg.dev>	2022-04-07 17:11:40 -07:00
Derek McGowan	d28bc92657	Introduce streaming to client and server Implementation of the 1.2 protocol with support for streaming. Provides the client and server interfaces for implementing services with streaming. Unary behavior is mostly unchanged and avoids extra stream tracking just for unary calls. Streaming calls are tracked to route data to the appropriate stream as it is received. Stricter stream ID handling, disallowing unexpected re-use of stream IDs. Signed-off-by: Derek McGowan <derek@mcg.dev>	2022-04-07 17:11:40 -07:00
Kazuyoshi Kato	d240c5005f	Use google.golang.org/protobuf instead of github.com/gogo/protobuf This change replaces github.com/gogo/protobuf with google.golang.org/protobuf, except for the code generators. All proto-encoded structs are now generated from .proto files, which include ttrpc.Request and ttrpc.Response. Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>	2022-02-16 23:11:27 +00:00
Derek McGowan	f7a2e09ef8	Fix lint issues Cleanup server test Signed-off-by: Derek McGowan <derek@mcg.dev>	2022-01-21 12:04:30 -08:00
Kevin Parsons	4f0aeb590b	client: Handle sending/receiving in separate goroutines Changes the TTRPC client logic so that sending and receiving with the server are in completely independent goroutines, with shared state guarded by a mutex. Previously, sending/receiving were tied together by reliance on a coordinator goroutine. This led to issues where if the server was not reading from the connection, the client could get stuck sending a request, causing the client to not read responses from the server. See [1] for more details. The new design sets up separate sending/receiving goroutines. These share state in the form of the set of active calls that have been made to the server. This state is encapsulated in the callMap type and access is guarded by a mutex. The main event loop in `run` previously handled a lot of state management for the client. Now that most state is tracked by the callMap, it mostly exists to notice when the client is closed and take appropriate action to clean up. Also did some minor code cleanup. For instance, the code was previously written to support multiple receiver goroutines, though this was not actually used. I've removed this for now, since the code is simpler this way, and it's easy to add back if we actually need it in the future. [1] https://github.com/containerd/ttrpc/issues/72 Signed-off-by: Kevin Parsons <kevpar@microsoft.com>	2021-10-13 17:31:34 -07:00
zounengren	81faa3ee80	replace pkg/errors from vendor Signed-off-by: Zou Nengren <zouyee1989@gmail.com>	2021-09-25 22:46:36 +08:00
Wei Fu	225de2c936	client: add UserOnCloseWait function ttrpc provides WithOnClose option for user and ttrpc will call the callback function when connection is closed unexpectedly or the ttrpc client's Close() method is called. containerd runtime plugin uses it to handle cleanup the resources created by containerd shim. But the ttrpc client's Close() is only trigger and the shim's cleanup resource callback is called asynchronously, which might make part of resources leaky. There is an example from containerd-runtime-v2 for runc: ```happy [Task.Delete goroutine] [cleanupCallback goroutine] call ttrpc client.Close() --> read bundle and call runc delete delete bundle ``` If the cleanupCallback is called after deleting bundle, the callback will fail to call runc delete. If there is any running processes, the resource becomes leaky. ```unhappy [Task.Delete goroutine] [cleanupCallback goroutine] call ttrpc client.Close() --> delete bundle failed to read bundle and call runc delete ``` In order to avoid this, introduces the UserOnCloseWait to make sure that the cleanupCallback has been called synchronously, like: ``` [Task.Delete goroutine] [cleanupCallback goroutine] call ttrpc client.Close() --> wait for callback read bundle and call runc delete <-- finish sync delete bundle ``` Signed-off-by: Wei Fu <fuweid89@gmail.com>	2020-09-07 23:09:55 +08:00
blade	df116954de	fix bug, failed to assert net error due to error wrap Signed-off-by: blade <blade.shen@ucloud.cn>	2020-08-07 00:57:23 +08:00
Michael Crosby	d4834b09f5	Revert "Copy codes and status from grpc project" This reverts commit `f02233564f`. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-10-28 14:46:51 -04:00
Michael Crosby	f02233564f	Copy codes and status from grpc project This copies the codes and status package from grpc as it is the only references to the grpc project from ttrpc. This will help ensure that API breaking changes in grpc do not affect ttrpc. Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-10-28 12:33:50 -04:00
Wei Fu	6e416eafd2	return ErrClosed if read: connection reset by peer When call server.Close(), server will close all listener and notify flighting-connection to shutdown. Connections are closed asynchronously. In TestClientEOF, client can send request into closing-connection. But the read for reply will return error if the closing-connection is shutdown. In this case, we should filter error for client side about `read: connection reset by peer`. Signed-off-by: Wei Fu <fuweid89@gmail.com>	2019-10-21 19:18:01 +08:00
Michael Crosby	0e0f228740	Handle ok status Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-08-28 11:45:14 -04:00
Sebastiaan van Stijn	17f4d32234	Client.Call(): do not return error if no Status is set (gRPC v1.23 and up) To account for `5da5b1f225`, which is part of gRPC v1.23.0 and up, and after which gRPC no longer sets a Status if no error occured. Signed-off-by: Sebastiaan van Stijn <github@gone.nl>	2019-08-26 19:18:48 +02:00
Phil Estes	1fb3814edf	Merge pull request #42 from crosbymichael/client Refactor close handling for ttrpc clients	2019-06-13 14:33:16 -04:00
Michael Crosby	694de9d955	metadata as KeyValue type Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-13 18:06:27 +00:00
Michael Crosby	3afb82bd27	Fix error handling with server shutdown Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-13 17:19:47 +00:00
Michael Crosby	f3eb35b158	Refactor close handling for ttrpc clients Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-13 16:52:46 +00:00
Michael Crosby	de8faac08b	Add godocs for interceptors Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-12 20:26:36 +00:00
Michael Crosby	819653f40c	Add client and server unary interceptors Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2019-06-07 15:49:42 +00:00
Maksym Pavlenko	04523b9d2c	Rename headers to metadata Signed-off-by: Maksym Pavlenko <makpav@amazon.com>	2019-05-24 17:02:38 -07:00
Maksym Pavlenko	5926a92b70	Support headers Signed-off-by: Maksym Pavlenko <makpav@amazon.com>	2019-05-23 13:34:04 -07:00
Lantao Liu	ba15956d22	Make onclose an option. Signed-off-by: Lantao Liu <lantaol@google.com>	2019-04-11 10:57:14 -07:00
Phil Estes	6914432707	Merge pull request #33 from JoeWrightss/patch-1 Fix returns error message	2019-02-10 20:22:30 -08:00
zhoulin xie	ce5c1c4546	Fix returns error message Signed-off-by: zhoulin xie <zhoulin.xie@daocloud.io>	2019-02-08 22:30:00 +08:00
Brian Goff	a364f44e55	Add support for request timeout propgation. Adds a new field to the `Request` type which specifies a timeout (in nanoseconds) for the request. This is propagated on method dispatch as a context timeout. There was some discussion here on supporting a broader "metadata" field (similar to grpc) that can be used for other things, but we ended up with a dedicated field because it is lighter weight and expect it to be used pretty heavily as is.... metadata may be added in the future, but is not necessary for timeouts. Also discussed using a deadline vs a timeout in the request and decided to go with a timeout in order to deal with potential clock skew between the client and server. This also has the side-effect of eliminating the protocol/wire overhead from the request timeout. Signed-off-by: Brian Goff <cpuguy83@gmail.com>	2019-01-07 12:43:52 -08:00
Michael Crosby	d77f111e2e	Add client side context.Done support Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-08-28 10:54:57 -04:00
Michael Crosby	0690b20898	Add apache license to files Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-06-27 17:49:06 -04:00
Arnaud Rebillout	87ac4c6f7a	Log with sirupse/logrus to avoid a circular dependency to containerd #6 Signed-off-by: Arnaud Rebillout <arnaud.rebillout@collabora.com>	2018-02-22 13:28:57 +07:00
Michael Crosby	042635eccb	Add onclose func Signed-off-by: Michael Crosby <crosbymichael@gmail.com>	2018-02-05 16:51:41 -05:00
Stephen Day	4d1bf6563c	Merge pull request #20 from stevvooe/pump-read-block ttrpc: refactor channel to take a conn	2018-01-16 15:50:42 -08:00
Stephen J Day	c575201d9a	ttrpc: refactor channel to take a conn Signed-off-by: Stephen J Day <stephen.day@docker.com>	2018-01-16 15:29:07 -08:00
Stephen J Day	2c96d0a152	ttrpc: return correct error on (Client).Close Because `shutdownErr` will likely be `nil` in the close select branch, returning it to waiters will result in the waiting `(Client).Call` returning `(nil, nil)`. This should take whatever is set for the client as the exit condition, which is likely to be `ErrClosed`. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2018-01-12 16:21:27 -08:00
Stephen J Day	e963fd5a12	ttrpc: return ErrClosed when client is shutdown To gracefully handle scenarios where the connection is closed or the client is closed, we now set the final error to be `ErrClosed`. Callers can resolve it through using `errors.Cause` to detect this condition. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2018-01-09 14:46:02 -08:00
Stephen J Day	5859cd7b45	ttrpc: return buffers to pool Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-29 21:32:38 -08:00
Stephen J Day	b774f8872e	ttrpc: refactor client to better handle EOF The request and response requests opened up a nasty race condition where waiters could find themselves either blocked or receiving errant errors. The result was low performance and inadvertent busy waits. This refactors the client to have a single request into the main client loop, eliminating the race. The reason for the original design was to allow a sender to control request and response individually to make unit testing easier. The unit test has now been refactored to use a channel to ensure that requests are serviced on graceful shutdown. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-29 21:00:50 -08:00
Stephen J Day	2a1ad5f6c7	ttrpc: increase maximum message length This change increases the maximum message size to 4MB to be inline with the grpc default. The buffer management approach has been changed to use a pool to minimize allocations and keep memory usage low. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-29 13:30:41 -08:00
Stephen J Day	b1feeec836	ttrpc: implement Close and Shutdown This apples logic to correctly Close a server, as well as implements graceful shutdown. This ensures that inflight requests are not interrupted and works similar to the functionality in `net/http`. This required a fair bit of refactoring around how the connection is managed. The connection now has an explicit wrapper object, ensuring that shutdown happens in a coordinated fashion, whether or not a forceful close or graceful shutdown is called. In addition to the above, hardening around the accept loop has been added. We now correctly exit on non-temporary errors and debounce the accept call when encountering repeated errors. This should address some issues where `SIGTERM` was not honored when dropping into the accept spin. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-29 11:03:51 -08:00
Stephen J Day	bdb2ab7a81	ttrpc: use odd numbers for client initiated streams Following the convention of http2, we now use odd stream ids for client initiated streams. This makes it easier to tell who initiates the stream. We enforce the convention on the server-side. This allows us to upgrade the protocol in the future to have server initiated streams. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-27 18:18:25 -08:00
Stephen J Day	07cd4de2f2	ttrpc: correctly propagate error from response Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-22 12:06:18 -08:00
Stephen J Day	7f752bf263	ttrpc: handle concurrent requests and responses With this changeset, ttrpc can now handle mutliple outstanding requests and responses on the same connection without blocking. On the server-side, we dispatch a goroutine per outstanding reequest. On the client side, a management goroutine dispatches responses to blocked waiters. The protocol has been changed to support this behavior by including a "stream id" that can used to identify which request a response belongs to on the client-side of the connection. With these changes, we should also be able to support streams in the future. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-21 21:38:38 -08:00
Stephen J Day	2a81659f49	ttrpc: remove use of typeurl Rather than employ the typeurl package, we now generate code to correctly allocate the incoming types from the caller. As a side-effect of this activity, the services definitions have been split out into a separate type that handles the full resolution and dispatch of the method, incuding correctly mapping the RPC status. This work is a pre-cursor to larger protocol change that will allow us to handle multiple, concurrent requests. Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-21 18:03:52 -08:00
Stephen J Day	f147d6ca77	ttrpc: rename project to ttrpc Signed-off-by: Stephen J Day <stephen.day@docker.com>	2017-11-15 17:04:16 -08:00

1 2

51 Commits