Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net>
Stream Parser
-------------

The stream parser (strparser) is a utility that parses messages of an
application layer protocol running over a TCP connection. The stream
parser works in conjunction with an upper layer in the kernel to provide
kernel support for application layer messages. For instance, Kernel
Connection Multiplexor (KCM) uses the stream parser to parse messages
using a BPF program.

Interface
---------

The API includes a context structure, a set of callbacks, utility
functions, and a data_ready function. The callbacks include
a parse_msg function that is called to perform parsing (e.g.
BPF parsing in case of KCM), and a rcv_msg function that is called
when a full message has been completed.

A stream parser can be instantiated for a TCP connection. This is done
by:

strp_init(struct strparser *strp, struct sock *csk,
	  struct strp_callbacks *cb)

strp is a struct of type strparser that is allocated by the upper layer.
csk is the TCP socket associated with the stream parser. cb points to
the set of callbacks that the stream parser invokes.
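
The relationship between the context structure and the callback table
can be modelled in userspace. The following is only a sketch under
stated assumptions, not the kernel implementation: every demo_ and
example_ name is hypothetical, an integer stands in for struct sock,
a plain buffer pointer stands in for struct sk_buff, and rejecting a
missing parse_msg or rcv_msg with -EINVAL is an assumption of the
sketch. It shows defaults being filled in for the two optional
callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_strparser;

/* Userspace stand-in for the callback table (hypothetical names). */
struct demo_callbacks {
	int (*parse_msg)(struct demo_strparser *strp, const void *buf);
	void (*rcv_msg)(struct demo_strparser *strp, const void *buf);
	int (*read_sock_done)(struct demo_strparser *strp, int err);
	void (*abort_parser)(struct demo_strparser *strp, int err);
};

/* Userspace stand-in for the context allocated by the upper layer. */
struct demo_strparser {
	int sock_fd;	/* stands in for struct sock *csk */
	int stopped;	/* set once the parser has been stopped */
	int sk_err;	/* error recorded on the modelled socket */
	struct demo_callbacks cb;
};

/* Default used when read_sock_done is left NULL. */
static int demo_default_read_sock_done(struct demo_strparser *strp, int err)
{
	(void)strp;
	return err;
}

/* Default used when abort_parser is left NULL: stop and record error. */
static void demo_default_abort_parser(struct demo_strparser *strp, int err)
{
	strp->stopped = 1;
	strp->sk_err = err;
}

/* Model of initialization: check required callbacks, fill defaults. */
static int demo_strp_init(struct demo_strparser *strp, int sock_fd,
			  const struct demo_callbacks *cb)
{
	assert(strp != NULL);

	if (!cb || !cb->parse_msg || !cb->rcv_msg)
		return -EINVAL;	/* sketch assumption: both are mandatory */

	strp->sock_fd = sock_fd;
	strp->stopped = 0;
	strp->sk_err = 0;
	strp->cb = *cb;
	if (!strp->cb.read_sock_done)
		strp->cb.read_sock_done = demo_default_read_sock_done;
	if (!strp->cb.abort_parser)
		strp->cb.abort_parser = demo_default_abort_parser;
	return 0;
}

/* Trivial example callbacks so the table can be populated. */
static int example_parse(struct demo_strparser *strp, const void *buf)
{
	(void)strp;
	(void)buf;
	return 0;	/* always ask for more data */
}

static void example_rcv(struct demo_strparser *strp, const void *buf)
{
	(void)strp;
	(void)buf;
}
```

An upper layer in this model would fill in parse_msg and rcv_msg, leave
the other two members NULL to get the defaults, and then call
demo_strp_init once per connection.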

Callbacks
---------

There are four callbacks:

int (*parse_msg)(struct strparser *strp, struct sk_buff *skb);

    parse_msg is called to determine the length of the next message
    in the stream. The upper layer must implement this function. It
    should parse the sk_buff as containing the headers for the
    next application layer message in the stream.

    The skb->cb in the input skb is a struct strp_rx_msg. Only
    the offset field is relevant in parse_msg and gives the offset
    where the message starts in the skb.

    The return values of this function are:

    >0        : indicates length of successfully parsed message
    0         : indicates more data must be received to parse the message
    -ESTRPIPE : current message should not be processed by the
                kernel, return control of the socket to userspace which
                can proceed to read the messages itself
    other < 0 : error in parsing, give control back to userspace
                assuming that synchronization is lost and the stream
                is unrecoverable (application expected to close TCP socket)

    In the case that an error is returned (return value is less than
    zero) the stream parser will set the error on the TCP socket and
    wake it up. If parse_msg returned -ESTRPIPE and the stream parser
    had previously read some bytes for the current message, then the
    error set on the attached socket is ENODATA since the stream is
    unrecoverable in that case.
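
The return convention above can be illustrated with a userspace model.
This is a sketch under stated assumptions, not kernel code: the wire
format (a 4-byte big-endian length that counts the header plus the
payload), the rule that a zero length means "hand the stream to
userspace", and the names demo_parse_msg and demo_sock_error are all
hypothetical. buf and buf_len stand in for the skb data and skb->len,
and offset stands in for the offset field of struct strp_rx_msg.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef ESTRPIPE
#define ESTRPIPE 86	/* Linux value, for platforms that lack it */
#endif

#define DEMO_HDR_LEN 4	/* hypothetical header: 4-byte length field */

/* Model of a parse_msg-style callback for the hypothetical format. */
static int demo_parse_msg(const uint8_t *buf, size_t buf_len, size_t offset)
{
	uint32_t msg_len;

	assert(buf != NULL);

	if (offset > buf_len || buf_len - offset < DEMO_HDR_LEN)
		return 0;	/* header incomplete: need more data */

	msg_len = ((uint32_t)buf[offset] << 24) |
		  ((uint32_t)buf[offset + 1] << 16) |
		  ((uint32_t)buf[offset + 2] << 8) |
		  (uint32_t)buf[offset + 3];

	if (msg_len == 0)
		return -ESTRPIPE;	/* sketch-only rule: userspace takes over */
	if (msg_len < DEMO_HDR_LEN || msg_len > INT32_MAX)
		return -EBADMSG;	/* malformed header: unrecoverable */

	return (int)msg_len;	/* length of the next message */
}

/*
 * Model of the error that ends up on the socket. accum_len is the
 * number of bytes already read for the current message. Returns a
 * positive errno value, or 0 when parse_ret is not an error.
 */
static int demo_sock_error(int parse_ret, size_t accum_len)
{
	if (parse_ret >= 0)
		return 0;
	if (parse_ret == -ESTRPIPE && accum_len > 0)
		return ENODATA;	/* partial message consumed: unrecoverable */
	return -parse_ret;
}
```

Checking the header bounds before reading the length field is what
makes the "0 means more data is needed" case safe when the header is
split across segments.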

void (*rcv_msg)(struct strparser *strp, struct sk_buff *skb);

    rcv_msg is called when a full message has been received and
    is queued. The callee must consume the sk_buff; it can
    call strp_pause to prevent any further messages from being
    received in rcv_msg (see strp_pause below). This callback
    must be set.

    The skb->cb in the input skb is a struct strp_rx_msg. This
    struct contains two fields: offset and full_len. Offset is
    where the message starts in the skb, and full_len is the
    length of the message. skb->len - offset may be greater
    than full_len since strparser does not trim the skb.
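
Because the buffer is not trimmed, a consumer has to bound its reads by
full_len rather than by the remaining buffer length. The following
userspace sketch illustrates that point; struct demo_rx_msg and
demo_copy_msg are hypothetical stand-ins, with data and data_len
modelling the skb data and skb->len.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the two fields described for struct strp_rx_msg. */
struct demo_rx_msg {
	int full_len;	/* length of the message */
	int offset;	/* where the message starts in the buffer */
};

/*
 * Copy exactly one message out of a buffer that may hold trailing
 * bytes. Returns the number of bytes copied, or -1 if the message
 * does not fit in the input or output buffer.
 */
static int demo_copy_msg(const unsigned char *data, size_t data_len,
			 const struct demo_rx_msg *rxm,
			 unsigned char *out, size_t out_len)
{
	assert(data && rxm && out);

	if (rxm->offset < 0 || rxm->full_len < 0)
		return -1;
	if ((size_t)rxm->offset > data_len ||
	    (size_t)rxm->full_len > data_len - (size_t)rxm->offset)
		return -1;
	if ((size_t)rxm->full_len > out_len)
		return -1;

	/* Use full_len, not data_len - offset: the buffer is not trimmed. */
	memcpy(out, data + rxm->offset, (size_t)rxm->full_len);
	return rxm->full_len;
}
```

With a 10-byte buffer, offset 2 and full_len 5, only bytes 2 through 6
belong to the message; the last three bytes are left for whoever parses
the next message.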

int (*read_sock_done)(struct strparser *strp, int err);

    read_sock_done is called when the stream parser is done reading
    the TCP socket. The stream parser may read multiple messages
    in a loop and this function allows cleanup to occur when exiting
    the loop. If the callback is not set (NULL in strp_init) a
    default function is used.

void (*abort_parser)(struct strparser *strp, int err);

    This function is called when the stream parser encounters an
    error in parsing. The default function stops the stream parser
    for the TCP socket and sets the error in the socket. The default
    function can be changed by setting the callback to non-NULL in
    strp_init.

Functions
---------

The upper layer calls strp_tcp_data_ready when data is ready on the lower
socket for strparser to process. This should be called from a data_ready
callback that is set on the socket.

strp_stop is called to completely stop stream parser operations. This
is called internally when the stream parser encounters an error, and
it is called from the upper layer when detaching a TCP socket.

strp_done is called to detach the stream parser from the TCP socket.
This must be called after the stream parser has been stopped.

strp_check_rcv is called to check for new messages on the socket. This
is normally called at initialization of a stream parser instance
or after strp_unpause.

Statistics
----------

Various counters are kept for each stream parser for a TCP socket.
These are in the strp_stats structure. strp_aggr_stats is a convenience
structure for accumulating statistics for multiple stream parser
instances. save_strp_stats and aggregate_strp_stats are helper functions
to save and aggregate statistics.

Message assembly limits
-----------------------

The stream parser provides mechanisms to limit the resources consumed by
message assembly.

A timer is set when assembly starts for a new message. The message
timeout is taken from the receive timeout (sk_rcvtimeo) of the
associated TCP socket. If the timer fires before assembly completes,
the stream parser is aborted and the ETIMEDOUT error is set on the TCP
socket.

Message length is limited to the receive buffer size of the associated
TCP socket. If the length returned by parse_msg is greater than
the socket buffer size then the stream parser is aborted with the
EMSGSIZE error set on the TCP socket. Note that this makes the
maximum size of receive skbuffs for a socket with a stream parser
2*sk_rcvbuf of the TCP socket.
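
The length limit amounts to a single comparison against the receive
buffer size. A minimal userspace sketch, assuming a hypothetical helper
demo_check_msg_len in which rcvbuf stands in for sk_rcvbuf and
parsed_len for the positive value returned by parse_msg:

```c
#include <assert.h>
#include <errno.h>

/*
 * Model of the message length limit. Returns 0 if assembly may
 * proceed, or -EMSGSIZE if the parser would be aborted.
 */
static int demo_check_msg_len(int parsed_len, int rcvbuf)
{
	assert(parsed_len > 0 && rcvbuf > 0);

	if (parsed_len > rcvbuf)
		return -EMSGSIZE;	/* longer than the receive buffer */
	return 0;
}
```

A message whose length equals the receive buffer size is still
accepted in this model, since the text only describes an abort when
the length is greater.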