Louis Del Valle c8732dfa6f
Update sub_quadratic_attention.py
1. Determine the number of query chunks.
2. Calculate the final shape of the result tensor (res).
3. Initialize the tensor with the calculated shape and dtype (usually the same dtype as the input tensors).

The tensor can be initialized as a zero-filled tensor with the correct shape and dtype; the attention scores are then computed for each query chunk and written into the corresponding slice of the tensor.
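
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in sub_quadratic_attention.py: it uses NumPy arrays in place of torch tensors so it runs standalone, and the function name chunked_attention, its parameters, and the (batch, tokens, dim) layout are all assumptions made for the example.

```python
import math

import numpy as np


def chunked_attention(query, key, value, query_chunk_size):
    """Hypothetical helper: softmax attention computed one query chunk at a time.

    Assumed layout: query is (batch, q_tokens, dim); key and value are
    (batch, k_tokens, dim) and (batch, k_tokens, v_dim).
    """
    batch, q_tokens, dim = query.shape

    # 1. Determine the number of query chunks (the last one may be shorter).
    num_chunks = math.ceil(q_tokens / query_chunk_size)

    # 2. Calculate the final shape of the result tensor.
    res_shape = (batch, q_tokens, value.shape[-1])

    # 3. Preallocate a zero-filled result with the same dtype as the inputs.
    res = np.zeros(res_shape, dtype=query.dtype)

    scale = dim ** -0.5
    for i in range(num_chunks):
        start = i * query_chunk_size
        end = min(start + query_chunk_size, q_tokens)

        # Scores for this chunk only: (batch, chunk, k_tokens).
        scores = (query[:, start:end] @ key.transpose(0, 2, 1)) * scale

        # Numerically stable softmax over the key axis.
        scores = scores - scores.max(axis=-1, keepdims=True)
        weights = np.exp(scores)
        weights = weights / weights.sum(axis=-1, keepdims=True)

        # Fill the corresponding slice of the preallocated result.
        res[:, start:end] = weights @ value

    return res
```

Writing each chunk into a slice of a preallocated array avoids collecting per-chunk results in a list and concatenating them at the end, which would briefly hold two copies of the output.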
2023-05-10 22:05:18 -05:00