Index | Library | Commit hash | Language | Type of commit | Root Cause | Manifestation/End User Impact | IEEE arithmetic exception type | Background | Problem | DL Topic - level 1 | DL Topic - level 2 | DL Topic - level 3 | Patch type - level 1 | Patch type - level 2 | Patch type - level 3 | Old Solution | New Solution | Test | Math operation | References |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | PyTorch | ac72881f3ff8c46c2a5cf8b09d02babf46bc4c85 | CUDA | Fix | loss of precision | inaccurate result of mean in batch normalization | Inexact | Sync batch norm applies Batch Normalization over an N-dimensional input (a mini-batch of [N-2]D inputs with an additional channel dimension): y = ((x - E[x])/sqrt(Var[x] + epsilon)) * alpha + beta |
Numerical issue in CUDA channels-last SyncBatchNorm; apex SyncBatchNorm channels-last also has this issue | data processing | batch normalization | batch normalization, Cuda | rewrite math formula | rewrite math formula | Replace div_roundup() with ATenCeilDiv: int div_roundup(int x, int y) { return lastPow2(1 + (x-1)/y); }, where lastPow2 returns 2**floor(log2(n)), becomes ATenCeilDiv(T a, T b) { return (a + b - 1) / b; } |
int div_roundup(int x, int y) { return lastPow2(1 + (x-1)/y); } static int lastPow2(unsigned int n) { n |= (n >> 1); n |= (n >> 2); n |= (n >> 4); n |= (n >> 8); n |= (n >> 16); return std::max<int>(1, n - (n >> 1)); }, where |= is a bitwise OR operator |
ATenCeilDiv(T a, T b) { return (a + b - 1) / b; } |
def _batch_norm_stats(data): mean1, _ = torch.batch_norm_stats(data, 1e-5) mean2, _ = torch.batch_norm_stats(data.to(memory_format=torch.channels_last), 1e-5) mean_ref = torch.mean(data, (0, 2, 3), keepdim=False) self.assertEqual(mean_ref, mean1) self.assertEqual(mean_ref, mean2) |
division round up | https://arxiv.org/abs/1502.03167 https://pytorch.org/docs/stable/generated/torch.nn.SyncBatchNorm.html |
2 | PyTorch | dfc7fa03e5d33f909b9d7853dd001086f5d782a0 | Python | Fix | loss of precision | inaccurate result of gradient | Inexact | lower–upper (LU) decomposition (also called LU factorization) factors a matrix as the product of a lower triangular matrix and an upper triangular matrix. It is a procedure for decomposing an N×N matrix A into a product of a lower triangular matrix L and an upper triangular matrix U, LU=A. Matrix A = LU. In the lower triangular matrix all elements above the diagonal are zero, in the upper triangular matrix, all the elements below the diagonal are zero. LU decomposition is an efficient method used for solving a system of linear equations. Suppose we have B=AX and want to solve for X. (The solution could be X = inverse(A)B. But a matrix inverse is numerically unstable.) Find LU decomposition of A, A = LU. So, B=AX=LUX. Then solve for X with two equations: (1) LY = B and (2) UX = Y |
Matrix inverse is numerically unstable; as a result, the numerical and analytical gradients for the LU decomposition differ too much. The gradients for the LU decomposition are unstable. lu_backward is implemented via autograd; torch.det uses LU in the forward pass, while det_backward uses svd_backward (singular value decomposition). The issue with svd_backward is that it is only stable for inputs with distinct singular values. As a result, TestGradientsCuda::test_fn_gradgrad_linalg_det_cuda_float64, which compares the numerical and analytical gradients, fails on Windows with GPU. svd_backward is only stable for ranks n - 1 <= r <= n with singular values sufficiently far away from each other. |
gradients/derivatives | automatic differentiation | gradients for the LU decomposition, backward pass, autograd, linear algebra operations, determinant of a square matrix | use a different algorithm | use a different algorithm | Replace matrix inverse with solutions to systems of linear triangular equations. System of "triangular" equations refers to the equations having the form of a triangle, because of the lower equations containing only the later variables. However, works only for square matrices of full rank |
- I = LU_grad.new_zeros(LU_grad.shape) - I.diagonal(dim1=-2, dim2=-1).fill_(1) - Lt_inv = torch.triangular_solve(I, L, upper=False).solution.transpose(-1, -2) - Ut_inv = torch.triangular_solve(I, U, upper=True).solution.transpose(-1, -2) - - phi_L = (L.transpose(-1, -2) @ LU_grad).tril_() - phi_U = (LU_grad @ U.transpose(-1, -2)).triu_() - - self_grad_perturbed = Lt_inv @ (phi_L + phi_U) @ Ut_inv - return P @ self_grad_perturbed, None, None |
phi_L = (L.transpose(-1, -2).conj() @ LU_grad).tril_() + phi_U = (LU_grad @ U.transpose(-1, -2).conj()).triu_() + phi = phi_L + phi_U + X = torch.triangular_solve(phi, L.transpose(-1, -2).conj(), upper=True).solution + A_grad = torch.triangular_solve(X.transpose(-1, -2).conj() @ P.transpose(-1, -2), U, upper=True) \ + .solution.transpose(-1, -2).conj() + + return A_grad, None, None |
def sample_inputs_lu(op_info, device, dtype, requires_grad=False): + # not needed once OpInfo tests support Iterables + def generate_samples(): + batch_shapes = ((), (3,), (3, 3)) + for batch_shape, get_infos in product(batch_shapes, (True, False)): + shape = batch_shape + (S, S) + input = make_tensor(shape, device, dtype, requires_grad=requires_grad, low=None, high=None) + yield SampleInput(input, args=(True, get_infos)) + + return list(generate_samples()) |
matrix inverse, autograd | |
3 | PyTorch | 8e507ad00ebdfd0ae84bc03718e9c2cb74b8573b | yaml | Fix | overflow/underflow/loss of precision | Inaccurate result | overflow, underflow, inexact | This script defines derivative formulas and Python signatures of methods on Variables | The division formula in the backward pass is unstable because squaring the divisor can lose precision. For extremely large divisors the square may overflow and evaluate to inf; for extremely small divisors it may underflow and evaluate to 0. | gradients/derivatives | automatic differentiation | backward pass, autograd, division, derivative, higher order gradients | rewrite math formula | rewrite math formula | Instead of dividing by other squared, divide by other twice. Mathematically x / y^2 = x / y / y, but if y is a large finite-precision floating point number, then computing y^2 may lose precision. Successive divisions achieve the same result while not losing as much precision for large values of y (see the sketch below). | other: -grad * self / (other * other) | other: -grad * (self / other) / other | division | ||
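A minimal sketch of the rewrite above, using plain Python floats rather than the actual derivatives.yaml formula; the values are chosen only to make the overflow visible:

```python
# x / y**2 and (x / y) / y are mathematically equal, but squaring a huge
# divisor first overflows to inf and the whole gradient collapses to zero.
grad, x, y = 1.0, 1e150, 1e200

naive = -grad * x / (y * y)     # y * y overflows to inf, so the result is -0.0
stable = -grad * (x / y) / y    # two in-range divisions give the correct -1e-250

print(naive, stable)
```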
4 | PyTorch | fe5d23cf4a9d8f673fb1bfc6e84c642fb6a23182 | C++ | Fix | loss of precision | incorrect result and NaN | Inexact | Cosine Similarity measures the cosine of the angle between two non-zero vectors of an inner product space. This similarity measurement is particularly concerned with orientation, rather than magnitude. In short, two cosine vectors that are aligned in the same orientation will have a similarity measurement of 1, whereas two vectors aligned perpendicularly will have a similarity of 0. If two vectors are diametrically opposed, meaning they are oriented in exactly opposite directions (i.e. back-to-back), then the similarity measurement is -1. Often, however, Cosine Similarity is used in positive space, between the bounds 0 and 1. Cosine Similarity is not concerned with, and does not measure, differences in magnitude (length), and is only a representation of similarities in orientation. | The cosine similarity implementation may lose precision and return a value greater than 1.0, which is incorrect, because cosine similarity outputs are in the range -1 to 1. | linear algebra | distance | cosine similarity distance | rewrite math formula | rewrite math formula | Use x / sqrt(x * x) instead of x / (sqrt(x) * sqrt(x)), following the scipy implementation (see the sketch after this entry) | - Tensor n12 = (w1 * w2).rsqrt_().clamp_max(1.0 / eps); - return w12.mul_(n12); |
Tensor n12 = (w1 * w2).clamp_min_(eps * eps).sqrt_(); + return w12.div_(n12); |
# Check dividing by 0. + input1 = torch.randn(10).requires_grad_() + input2 = torch.zeros_like(input1).requires_grad_() + torch.cosine_similarity(input1, input2, 0).sum().backward() + self.assertEqual(input1.grad, torch.zeros_like(input1)) + self.assertEqual(input2.grad, input1 * 1e8) |
reciprocal of square root | |
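A hedged sketch of the scipy-style formulation referenced above; `cosine_similarity_stable` is a hypothetical helper, not the PyTorch kernel, and mirrors the patched clamp_min_(eps * eps).sqrt_() denominator:

```python
import numpy as np

def cosine_similarity_stable(a, b, eps=1e-8):
    # Divide by sqrt(|a|^2 * |b|^2) computed as a single product (clamped by
    # eps**2) instead of multiplying two separately rounded square roots.
    w12 = np.dot(a, b)
    w1, w2 = np.dot(a, a), np.dot(b, b)
    n12 = np.sqrt(np.maximum(w1 * w2, eps * eps))
    return w12 / n12
```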
5 | Tensorflow/Keras | 646d25d15910dc5cc3532aebb7e8395487adad4f | C++ | Fix | overflow/underflow | softmax output is NaN | overflow, underflow | Softmax is a normalized exponential function that takes a vector of n real values as input and outputs a vector of n real values that represent a probability distribution and sum up to 1. In deep learning classifiers, softmax is used in the last layer, because it normalizes the output of the prior network layer, a vector with size n, to a probability distribution over n predicted output classes. | Direct calculation of the softmax function according to its definition formula is prone to numerical issues. The single-precision exp(x) function overflows for x > 89 and underflows for x < −104, which, in turn, causes NaN outputs in naïve implementations. | activation functions | activation functions | softmax, metal GPU acceleration | use a different algorithm | use a different algorithm | Implement a three-pass softmax algorithm (see the sketch below); see the algorithm in https://arxiv.org/pdf/2001.04438.pdf | softmax | https://arxiv.org/pdf/2001.04438.pdf |
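A minimal Python sketch of the three-pass scheme (max, sum, normalize); the actual Metal-GPU kernel from the commit is not reproduced here:

```python
import numpy as np

def softmax_three_pass(x):
    m = np.max(x)            # pass 1: find the maximum input
    e = np.exp(x - m)        # pass 2: exponentials of shifted inputs (all <= 1, no overflow)
    return e / np.sum(e)     # pass 3: normalize to a probability distribution

x = np.array([1000.0, 1001.0, 1002.0], dtype=np.float32)
print(softmax_three_pass(x))  # finite result; a naive exp(x)/sum(exp(x)) yields inf/inf = NaN
```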
6 | Tensorflow/Keras | a3d726ae8246371515a0f666c38668e9da7765f9 | C++ | Fix | underflow | error due to divide by zero | invalid operation, underflow | Compute the centered RMSProp; the gradient is normalized by an estimation of its variance | The denominator in the centered RMSProp optimizer does not add the small epsilon as the last operation, which is not effective at preventing division by zero. Given the current formula ms + eps - mg.square(), if ms and mg.square() are of very similar magnitude, subtracting the two similar numbers leads to loss of significant digits and the result can underflow to zero; because the epsilon was added to ms before the subtraction, it is absorbed and does not keep the denominator away from zero. | optimizers | optimizers | centered RMSprop optimizer | rewrite math formula | rewrite math formula | Rewrite the order of operations. Reorder the sum to (ms - mg^2) + epsilon so that epsilon is added last, for numerical stability both on CPU and GPU (see the sketch after this entry). | auto denom = ms + epsilon() - mg.square(); auto denom = epsilon.reshape(single).broadcast(bcast) + ms - mg.square().sqrt() |
auto denom = (ms - mg.square()) + epsilon() auto denom = (ms - mg.square()) + epsilon.reshape(single).broadcast(bcast) |
|||
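A sketch of the cancellation described above, with made-up float32 values (not the TensorFlow kernel):

```python
import numpy as np

ms, mg_sq, eps = np.float32(1e8), np.float32(1e8), np.float32(1e-7)

old = (ms + eps) - mg_sq   # eps is absorbed into the large ms, denominator collapses to 0.0
new = (ms - mg_sq) + eps   # epsilon added last survives: 1e-7

print(old, new)
```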
7 | PyTorch | 6a458512c22c908b19f49262fd0f32a14425ec80 | C++ | Fix | loss of precision | assertion error | Inexact | static_cast converts the type of a variable. static_cast can perform conversions between pointers to related classes, not only upcasts (from pointer-to-derived to pointer-to-base), but also downcasts (from pointer-to-base to pointer-to-derived). No checks are performed during runtime to guarantee that the object being converted is in fact a full object of the destination type. |
The function test_computes_cubic_kernel fails an assertion that the input is less than 1e-5; the input is actually slightly larger: 1.0790e-05. The cause is the precision of a variable returned by a function that performs the power operation (x to the power of y). On x86_64 a long double uses the x87 (the 8087 was the floating point co-processor of the 8086; it now lives on the same die as modern amd64 processors) special and proprietary 80-bit float. This 80-bit floating point type is not part of the IEEE 754 floating point standard. Even though it has more bits of precision, its lack of standardization and its niche nature mean that it is often the cause of stability issues and is not worth using. |
tensor math | tensor math | power, low level math | increase variable precision/change variable type | increase variable precision/change variable type | Stop using long doubles, they will only cause you trouble. Instead just use the same type as the function input | power | https://en.wikipedia.org/wiki/X87#Performance | |||
8 | Tensorflow/Keras | d4b5c606fc9fbd1a20b5b113b4bc831f31d889a3 | Python | fix | loss of precision | Dividing by a number that is squared means dividing by a very large or very small number. The square operation could overflow or underflow, respectively, and even if it does not, there is a risk of loss of precision due to dividing values of very different magnitudes | gradients/derivatives | gradients | gradient | rewrite math formula | rewrite math formula | Avoid a squared value in the denominator and rewrite the division as (-x/y)/y instead of -x/y^2. They are mathematically equivalent, but the first formula avoids dividing by very large or very small numbers. Proof that they are mathematically equivalent: (-x/y)/y = (-x/y)*(1/y) = -x/(y^2) | math_ops.reduce_sum(grad * math_ops.div(-x, math_ops.square(y)) | math_ops.reduce_sum(grad * math_ops.div(math_ops.div(-x, y), y) | division |
9 | Tensorflow/Keras | 2411514c726f4ccd98e864e8b2e253e6df99c39d | C++ | fix | loss of precision | The formula for dequantization in quantization range for multiplication is numerically unstable | quantization | quantization | dequantization | rewrite math formula | rewrite math formula | Rewrite the order of operations. Specifically, rewrite q_range_min + (input_array - q_lowest) * q_range_scale to the following: (q_range_min - q_lowest * q_range_scale) + input_array * q_range_scale, which is mathematically equivalent | #define DEQUANTIZE_WITH_EIGEN(input_array, q2f) \ - (q2f.range_min + \ - (((input_array.template cast<float>() - q2f.lowest_quantized())) * \ - q2f.range_scale)); |
#define DEQUANTIZE_WITH_EIGEN(input_array, q2f) \ + ((q2f.range_min - q2f.lowest_quantized() * q2f.range_scale) + \ + input_array.template cast<float>() * q2f.range_scale) |
// Test for signed 32 bit. + // Note that we cannot use input mins and maxes that match the range because + // there are 7 too few bits of mantissa accuracy in floats to represent + // 2**31-1 accurately. Also there is no good fraction to use because 2**31-1 + // is a mersenne prime. + Tensor input32(DT_QINT32, TensorShape({input_height, input_width})); + + // Use a quantizer centered at 0. + float input_range = 1LL << 25; + int64 num_levels = (1LL << 32) - 1; + float step_size = + static_cast<float>(static_cast<double>(input_range) / num_levels); + float q_compatible_min_value = + roundf(-(input_range / 2.0) / step_size) * step_size; + float q_compatible_max_value = q_compatible_min_value + input_range; + test::FillValues<qint32>(&input32, {-16384, 0, 16256, -13440, -13312, -13184, + 14720, 14848, 14976}); + + Tensor output32 = QuantizedTensorToFloat<qint32>( + input32, q_compatible_min_value, q_compatible_max_value); + test::FillValues<float>(&expected, {-128.0f, 0.0f, 127.0f, -105.0f, -104.0f, + -103.0f, 115.0f, 116.0f, 117.0f}); + // The quantization error in going between 1<<25 and 1<<32 levels. + const double kTolerance = .5 / 128.0; + test::ExpectTensorNear<float>(expected, output32, kTolerance); |
|||||
10 | PyTorch | 43ab91118226b330be6d2274a154b98da233d879 | C | Fix | loss of precision | Inaccurate result | Inexact | Dirichlet distribution is a family of continuous multivariate probability distributions parameterized by a vector Alpha of positive reals. It is a multivariate generalization of the beta distribution, hence its alternative name of multivariate beta distribution (MBD). Dirichlet distributions are commonly used as prior distributions in Bayesian statistics, and in fact the Dirichlet distribution is the conjugate prior of the categorical distribution and multinomial distribution. In Bayesian probability theory, if the posterior distribution p(θ|x) and the prior distribution p(θ) are from the same probability distribution family, then the prior and posterior are called conjugate distributions, and the prior is the conjugate prior for the likelihood function. The saddle point technique is a method for deriving an accurate approximation for the probability density function of the mean of a random sample. A point that is not a local extremum yet has zero gradient is called a saddle point, such point can occur in non-convex functions. |
low precision of gradient approximation in Dirichlet distribution | statistical distributions | statistical distributions | distributions, Dirichlet distribution, gradient approximation | use a different algorithm | use a different algorithm | Use Taylor expansion and Rice saddle point expansion to approximate gradient and use higher precision types for that computation | https://en.wikipedia.org/wiki/Dirichlet_distribution | ||||
11 | PyTorch | ae1a972d78950abc4dab372f496914b5e78b9637 | C++ | Fix | loss of precision | inaccurate result | Log softmax is an activation function used in the last layer of a neural network that outputs log probabilities | loss of precision in log_softmax CPU code when inputs are big but their differences are small | activation functions | activation functions | log softmax | rewrite math formula | rewrite math formula | Rewrite the order of operations to avoid loss of significant digits when subtracting two numbers of very similar magnitude. Change the order of operations so that a large number is first subtracted from another large number before adding a small number (see the sketch after this entry). | tmpsum = max_input + std::log(tmpsum); output_data[d * dim_stride] = input_data[d * dim_stride] - tmpsum; |
tmpsum = std::log(tmpsum); output_data[d * dim_stride] = + input_data[d * dim_stride] - max_input - tmpsum; |
log(exp(x_i)/sum(exp(x))) | log, exp, division, sum |
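A plain-NumPy sketch of the reordering described above, with large inputs whose differences are small:

```python
import numpy as np

x = np.full(2, 1e16)                            # big inputs, tiny (here zero) differences
max_input = np.max(x)
tmpsum = np.log(np.sum(np.exp(x - max_input)))  # log(2) ~ 0.693

old = x - (max_input + tmpsum)   # 1e16 + 0.693 rounds back to 1e16, output is 0.0
new = x - max_input - tmpsum     # subtract the large terms first: -0.693 (the correct log-softmax)

print(old, new)
```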
12 | PyTorch | 0c588a500b2219c028eefe595cff0829fd982f52 | Python | Fix | loss of precision | SigmoidCrossEntropyWithLogits computes sigmoid cross entropy given logits. Sigmoid cross-entropy is a Sigmoid activation plus a Cross-Entropy loss. | Using sigmoid followed by a multinomial logistic loss layer can be less stable than a single layer of sigmoid cross entropy with logits | loss functions | loss functions, activation functions | cross entropy, sigmoid | use a different algorithm | use a different algorithm | Use a single layer of sigmoid cross entropy with logits instead. Replace sigmoid + xent loss with SigmoidCrossEntropyWithLogits. The sigmoid layer computes the multinomial logistic loss of the sigmoid of its inputs. It's conceptually identical to a sigmoid layer followed by a multinomial logistic loss layer, but provides a more numerically stable gradient. |
13 | PyTorch | 3d06a1e075ef0e6f4bf862d13e83cdd4b02dbc32 | Cuda | Fix | loss of precision | Welford’s method is a usable single-pass method for computing the variance. It can be derived by looking at the differences between the sums of squared differences for N and N-1 samples. Algorithm: variance(samples): M := 0 S := 0 for k from 1 to N: x := samples[k] oldM := M M := M + (x-M)/k S := S + (x-M)*(x-oldM) return S/(N-1) |
THCTensor_varInnermostDim numerically unstable | tensor math | tensor math | low level tensor math, variance calculation, GPU | use a different algorithm | use a different algorithm | Make THCTensor_varInnermostDim numerically stable using Welford's algorithm (#3425) * Use Welford's algorithm when reducing along inner dimension for THCTensor's variance fn * Use accreals in THCTensor's varInnermostDim |
variance | https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm | |||||
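A direct Python transcription of the Welford pseudocode quoted in the Background column (illustrative, not the THC kernel):

```python
def variance_welford(samples):
    m = 0.0  # running mean
    s = 0.0  # running sum of squared deviations
    for k, x in enumerate(samples, start=1):
        old_m = m
        m += (x - old_m) / k
        s += (x - old_m) * (x - m)
    return s / (len(samples) - 1)   # unbiased variance, no large sum-of-squares intermediate

print(variance_welford([1.0, 2.0, 3.0]))  # 1.0, matching the unbiased-variance test in entry 15
```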
14 | PyTorch | 638f0b5d78fe5ff2e484dc573c35b97a4bcf4e82 | Python | Fix | invalid input | loss = NaN | invalid operation | Negative log likelihood loss with Poisson distribution of target. The Poisson distribution is used to model the number of events occurring within a given time interval. target ∼ Poisson(input); loss(input, target) = input − target * log(input) + log(target!) |
log(0) = NaN in the Poisson negative log likelihood loss function | loss functions | loss functions | loss, poisson negative log likelihood loss | rewrite math formula | rewrite math formula | Add a small epsilon to prevent log(0), following the Keras implementation; eps = 1e-8 (see the sketch after this entry) | `input - target * log(input)` | with log_input=False: `input - target * log(input + eps)`, where eps defaults to 1e-8 |
log | https://pytorch.org/docs/stable/generated/torch.nn.PoissonNLLLoss.html | |
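A sketch of the patched formula (eps = 1e-8 is the documented default); the helper name is illustrative:

```python
import torch

def poisson_nll_sketch(input, target, eps=1e-8):
    # Adding eps inside the log keeps target * log(0) from evaluating to NaN.
    return (input - target * torch.log(input + eps)).mean()

inp = torch.tensor([0.0, 2.0])
tgt = torch.tensor([0.0, 3.0])
print(poisson_nll_sketch(inp, tgt))  # finite; without eps, 0 * log(0) = NaN poisons the mean
```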
15 | PyTorch | 81b995514ea908b635d725e11d1b91ac7ad03eb0 | C | Fix | overflow/loss of precision | Welford’s method is a usable single-pass method for computing the variance. It can be derived by looking at the differences between the sums of squared differences for N and N-1 samples. Algorithm: variance(samples): M := 0 S := 0 for k from 1 to N: x := samples[k] oldM := M M := M + (x-M)/k S := S + (x-M)*(x-oldM) return S/(N-1) |
numerical stability of std and var of THTensor, formulas for the variance may involve sums of squares, which causes loss of precision or overflow when dealing with large values | tensor math | tensor math | low level tensor math, variance and standard deviation calculation, CPU | use a different algorithm | use a different algorithm | Use Welford’s algorithm for better numerical stability | tensor = torch.FloatTensor([1.0, 2.0, 3.0]) self.assertEqual(tensor.var(unbiased=True), 1.0) self.assertEqual(tensor.var(unbiased=False), 2.0 / 3.0) |
variance, standard deviation | |||||
16 | PyTorch | 455038e470dd60dae45f68948ae876b1931a8bf0 | Cuda | Fix | overflow/underflow | Spatial logsoftmax computes the log of spatial softmax. Spatial softmax returns the expected pixel locations of each feature map in a CNN and hence would be better described as spatial soft argmax. It is defined in https://arxiv.org/pdf/1504.00702.pdf. Each output channel of the softmax is a probability distribution over the location of a feature in the image. To convert from this distribution to a coordinate representation (fcx, fcy), the network calculates the expected image position of each feature, yielding a 2D coordinate for each channel. s_cij = e^(a_cij) / sum_from_i'_to_j'(e^(a_ci'j')), where i and j are coordinates specifying a location in an image |
Spatial log softmax in the CUDA backend for the Neural Network Package is not stable | activation functions | activation functions | spatial log softmax, CNN | rewrite math formula | rewrite math formula | Subtract the maximum input before exponentiation while accumulating the sum, which ensures that the argument of exp() does not become too large and overflow. | sum += THCNumerics<T>::exp(input[inputStartIndex + i]); sum = AccumT(1) / sum; output[outputIndex] = ScalarConvert<AccumT, T>::to( - THCNumerics<AccumT>::log(sum * THCNumerics<T>::exp(input[inputStartIndex + i]))); |
T maxInput = input[inputStartIndex]; + for (int i = 1; i < classSize; i++) { + T value = input[inputStartIndex + i]; + maxInput = THCNumerics<T>::ge(maxInput, value) ? maxInput : value; + } + sum += THCNumerics<T>::exp(input[inputStartIndex + i] - maxInput); + T logsum = maxInput + ScalarConvert<AccumT, T>::to(THCNumerics<AccumT>::log(sum)); + output[outputIndex] = input[inputStartIndex + i] - logsum; |
spatial logsoftmax, log, exp, scalar convert | https://arxiv.org/pdf/1504.00702.pdf | |||
17 | PyTorch | c010ef7f0c6d837809a7e973048afac76373e3de | Cuda | Fix | overflow | A CUDA block is a group of threads that execute the same task. CUDA blocks are grouped into a grid. A kernel (i.e., a CUDA function) is executed as a grid of blocks of threads. | Overflow issue in the GET_BLOCKS CUDA function that returns the number of blocks used for scheduling blocks on a CUDA device (i.e., an Nvidia GPU), because the addition on N could overflow for large N. | other | Cuda blocks | Cuda thread scheduling | rewrite math formula | rewrite math formula | Rather than directly adding to N, rearrange the operations to shrink N first (see the sketch after this entry). | (N + CUDA_NUM_THREADS - 1) / CUDA_NUM_THREADS; | auto block_num = (N - 1) / CUDA_NUM_THREADS + 1; | division |
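An illustration of the rearrangement with NumPy int32 scalars, which wrap on overflow much like the 32-bit int arithmetic in the CUDA code (the values are arbitrary):

```python
import numpy as np

N, T = np.int32(2_147_483_000), np.int32(512)   # N close to INT32_MAX

old = (N + T - np.int32(1)) // T                # N + T wraps around -> negative block count
new = (N - np.int32(1)) // T + np.int32(1)      # shrink N before adding -> 4194303

print(old, new)
```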
18 | PyTorch | 6be3e5d3bb00a288da51bd368c5342c8676bbcf7 | Python | Unit test | loss of precision | One of the basic units of computation in Caffe2 is the Operator. Operators in Caffe2 are kind of like functions. CAFFE (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework, originally developed at the University of California, Berkeley. It is open source, under a BSD license. It is written in C++, with a Python interface. | Unstable formula for updating the gradient and momentum for the Adagrad optimizer in the Caffe2 operators test script | optimizers | optimizers | adagrad testing, optimizer, gradients, caffe2, weight decay, momentum | rewrite math formula | rewrite math formula | Rewrite the formula; specifically, update the gradient using a temporary variable (change x += y to temp = x + y) | grad += weight_decay * param_in_f32 | grad_temp = grad + weight_decay * param_in_f32 |
19 | PyTorch | 0b7e8323256e56728e1ffc9ee5d701987af3d06c | C++ | Unit test | overflow | The primary difference between const and constexpr variables is that the initialization of a const variable can be deferred until run time. A constexpr variable must be initialized at compile time. | signed integer overflow of the variable range | other | random number generator | random number generator testing | increase variable precision/change variable type | increase variable precision/change variable type | Change the type of the variable range from signed to unsigned 64-bit integer, and change const auto to constexpr |
const int64_t max_val = std::is_floating_point<T>::value ? int64_max_val : static_cast<int64_t>(t_max_val); range = *to - from; range = max_val - from + 1; |
range = static_cast<uint64_t>(*to) - static_cast<uint64_t>(from); range = static_cast<uint64_t>(max_val) - static_cast<uint64_t>(from) + 1; |
|||||
20 | PyTorch | 470c496eb224bdd735eea1accf7269dfdd87d49f | Python | Fix | loss of precision | Cholesky inverse = Compute inverse of Hermitian positive definite matrix using Cholesky factorization inverse(S) = inverse(LL*) |
In the multivariate normal distribution class, the function that computes the precision matrix uses a matrix inverse, which is numerically unstable | statistical distributions | statistical distributions | multivariate normal distribution, precision matrix | use a different algorithm | use a different algorithm | Replace the naive inverse with a Cholesky inverse for improved stability (see the sketch after this entry) | - scale_tril_inv = torch.inverse(self._unbroadcasted_scale_tril) - return torch.matmul(scale_tril_inv.transpose(-1, -2), scale_tril_inv).expand( |
identity = torch.eye(self.loc.size(-1), device=self.loc.device, dtype=self.loc.dtype) + # TODO: use cholesky_inverse when its batching is supported + return torch.cholesky_solve(identity, self._unbroadcasted_scale_tril).expand( |
matrix inverse | ||||
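A minimal PyTorch sketch of the idea: solve (L Lᵀ) X = I with the Cholesky factor instead of inverting the covariance matrix. The 2x2 covariance below is made up for illustration:

```python
import torch

cov = torch.tensor([[2.0, 0.5], [0.5, 1.0]])
L = torch.linalg.cholesky(cov)                 # lower-triangular factor, cov = L @ L.T
identity = torch.eye(2)

precision = torch.cholesky_solve(identity, L)  # solves (L L^T) X = I, i.e. X = cov^{-1}
print(torch.allclose(precision, torch.linalg.inv(cov)))  # True, but computed via triangular solves
```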
21 | PyTorch | 071971476d7431a24e527bdc181981678055a95d | Python | Fix | overflow | torch.clamp(input, min, max, *, out=None) → Tensor Clamp all elements in input into the range [ min, max ]. |
The Binomial distribution class encounters overflow when logits are large. Note: the binomial distribution is parametrized by logits | statistical distributions | statistical distributions | Binomial distribution, log probability | rewrite math formula | rewrite math formula | Rewrite the equation for the log_prob method and use a custom clamp function on the logits. The custom clamp function works like torch.clamp(x, min=0), except that its gradient at x = 0 is 0.5 (see the sketch after this entry) |
- return (log_factorial_n - log_factorial_k - log_factorial_nmk + - value * self.logits - self.total_count * torch.log1p(self.logits.exp())) |
def _clamp_by_zero(x): + # works like clamp(x, min=0) but has grad at 0 is 0.5 + return (x.clamp(min=0) + x - x.clamp(max=0)) / 2 + normalize_term = (self.total_count * _clamp_by_zero(self.logits) + + self.total_count * torch.log1p(torch.exp(-torch.abs(self.logits))) + - log_factorial_n) + return value * self.logits - log_factorial_k - log_factorial_nmk - normalize_term |
def test_binomial_stable(self): + logits = torch.tensor([-100., 100.], dtype=torch.float) + total_count = 1. + x = torch.tensor([0., 0.], dtype=torch.float) + log_prob = Binomial(total_count, logits=logits).log_prob(x) + self.assertTrue(torch.isfinite(log_prob).all()) + + # make sure that the grad at logits=0, value=0 is 0.5 + x = torch.tensor(0., requires_grad=True) + y = Binomial(total_count, logits=x).log_prob(torch.tensor(0.)) + self.assertEqual(grad(y, x)[0], torch.tensor(-0.5)) |
||||
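The stability of the rewrite rests on the identity log(1 + e^x) = max(x, 0) + log1p(e^(-|x|)); a short sketch using the _clamp_by_zero helper from the patch:

```python
import torch

def _clamp_by_zero(x):
    # From the patch: behaves like x.clamp(min=0) but has gradient 0.5 at x == 0.
    return (x.clamp(min=0) + x - x.clamp(max=0)) / 2

logits = torch.tensor([-100.0, 100.0])

naive = torch.log1p(logits.exp())                                        # exp(100) -> inf
stable = _clamp_by_zero(logits) + torch.log1p(torch.exp(-logits.abs()))  # [~0, 100], finite

print(naive, stable)
```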
22 | PyTorch | 3dcc329746223bc24f8213ccbaa5eba09273e162 | C++ | Fix | loss of precision | Inaccurate result | Inexact | Summation of numbers should be performed from smallest to largest to avoid loss of significant digits | Loss of precision and floating point truncation in summation formula. Summing many floating point values can lead to loss in precision if the values are different orders of magnitude. | tensor math | tensor math | summation, tensor math | use a different algorithm | use a different algorithm | Use a tree based approach where items of similar orders of magnitude are summed together to avoid numerical instability. This algorithm does the summation along a single axis with multiple "levels" of accumulator, each of which is designed to hold the sum of an order of magnitude more values than the previous. e.g. if there are 2^16 elements, the first level will hold the sum of 2^4 elements, and so on in increasing powers of 2: 2^4, 2^8, 2^12 and finally 2^16. This limits the differences in magnitude of the partial results being added together, and so we don't lose accuracy as the axis length increases. |
A simplified recursive implementation would look like this: + + scalar_t row_sum(const scalar_t * data, int64_t n) { + // Note, in practice the chunk size can increase with n + // This allows the recursion depth to be limited to O(1). + constexpr int64_t min_chunk_size = 16; + + scalar_t sum = 0; + if (n <= min_chunk_size) { + // Recursive base case, calculate a simple running sum + for (int64_t i = 0; i < n; ++i) { + sum += data[i]; + } + return sum; + } + + // Recursively sum larger chunks of elements + const int64_t chunk_size = std::max(divup(n, min_chunk_size), min_chunk_size); + for (int64_t i = 0; i < n; i += chunk_size) { + sum += row_sum(data + i, std::min(chunk_size, n - i)); + } + return sum; + } |
ASSERT_NEAR(norm_after, max_norm, 1e-6); | sum | ||
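A Python transcription of the simplified recursive row_sum quoted above, compared against a single float32 accumulator (the input is illustrative):

```python
import numpy as np

def row_sum(data, min_chunk=16):
    n = len(data)
    if n <= min_chunk:
        s = np.float32(0.0)
        for x in data:               # base case: plain running sum over a small chunk
            s += x
        return s
    chunk = max(-(-n // min_chunk), min_chunk)   # ceil(n / min_chunk)
    s = np.float32(0.0)
    for i in range(0, n, chunk):     # combine partial sums of similar magnitude
        s += row_sum(data[i:i + chunk], min_chunk)
    return s

data = np.full(2**20, 0.1, dtype=np.float32)
naive = np.float32(0.0)
for x in data:
    naive += x                       # single accumulator: error grows with the axis length
print(row_sum(data), naive)          # the chunked sum stays much closer to 0.1 * len(data)
```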
23 | PyTorch | d16c8238e164c6499714de625eb73422382e5ec1 | Python | Fix | overflow/underflow | Inaccurate result, NaN | overflow, underflow, inexact | Softmax turns a vector of K real values into a vector of K real values that sum to 1. The input values can be positive, negative, zero, or greater than one, but the softmax transforms them into values between 0 and 1, so that they can be interpreted as probabilities. | The implementation of softmax for certain cases (when the dim argument of softmax does not equal ndim - 1, i.e. the last dimension) is numerically unstable. Large inputs to the exponential function produce infinity and the output of softmax becomes NaN. | activation functions | activation functions | softmax | use a different algorithm | use a different algorithm | Transpose the input so that the numerically stable ONNX softmax implementation can be used | softmax | exp |
24 | PyTorch | b403b10ff98a6bc1a238e7ba4eee6393b6b89048 | C++ | Fix | loss of precision | categorical cross entropy yields inaccurate results | Inexact | When a small float is subtracted from a large float, the large float does not change in value (though mathematically it should). log softmax does not work for large logits; as a result, nn.CrossEntropyLoss() yields incorrect results for big logits | loss functions | loss functions | logsoftmax, cross entropy loss | rewrite math formula | rewrite math formula | Rewrite the formula considering the maximum input. If we add a very small number to a large one, the small number is ignored. Example: tmpsum = 1e8 + log(2) = 1e8. Numerically with float precision the log(2) is ignored, so at the end we basically have 1e8 - (1e8 + log(2)) = 0 instead of -log(2). | [tmp_sum](Vec x) { return x - Vec(tmp_sum); } | [tmp_sum](Vec x) { return x - Vec(tmp_sum); } | def test_log_softmax(self): + x_small = torch.ones(1, 2, dtype=torch.float32) + x_big = x_small + 1e16 + self.assertEqual(F.log_softmax(x_small, -1), F.log_softmax(x_big, -1)) |
log softmax | subtraction |
25 | PyTorch | f8cab38578a99ad04d23256c2da877db4814f76f | Python | Fix | invalid operation | ? | Only a positive definite matrix has a unique Cholesky factorization A = RᵀR, where R is upper triangular with positive diagonal elements. A positive definite matrix = symmetric matrix with all positive eigenvalues. Cholesky decomposition is roughly twice as efficient as the LU decomposition for solving systems of linear equations. A = RᵀR, where R is called the Cholesky factor of A. | Matrix inverse triggers a Cholesky error, because the matrix is not positive definite. Also, matrix inverse can cause numerical instability. | statistical distributions | statistical distributions | Gaussian distribution | rewrite math formula | rewrite math formula | only take the inverse of a triangular matrix | def _precision_to_scale_tril(P): + # Ref: https://nbviewer.jupyter.org/gist/fehiepsi/5ef8e09e61604f10607380467eb82006#Precision-to-scale_tril + Lf = torch.cholesky(torch.flip(P, (-2, -1))) + L_inv = torch.transpose(torch.flip(Lf, (-2, -1)), -2, -1) + L = torch.triangular_solve(torch.eye(P.shape[-1], dtype=P.dtype, device=P.device), + L_inv, upper=False)[0] + return L |
matrix inverse | ||||
26 | PyTorch | c1790fa202f30e3aca1d1ecb31f26e0b3bb1e69f | Cuda, C++ | Fix | loss of precision | Linear interpolation is a method of curve fitting using linear polynomials to construct new data points within the range of a discrete set of known data points. Parameters of lerp: a, b, t; output: a + t (b-a). The parameter t defines where to estimate the value on the interpolated line; it is 0 at the first point and 1 at the second point. For interpolated values between the two points, t ranges between 0 and 1. https://en.wikipedia.org/wiki/Linear_interpolation#Programming_language_support | unstable formula for linear interpolation | tensor math | linear interpolation | linear interpolation | rewrite math formula | rewrite math formula | rewrite formula (see the sketch after this entry) // Imprecise method, which does not guarantee v = v1 when t = 1, due to floating-point arithmetic error. This method is monotonic. // This form may be used when the hardware has a native fused multiply-add instruction. float lerp(float v0, float v1, float t) { return v0 + t * (v1 - v0); } // Precise method, which guarantees v = v1 when t = 1. This method is monotonic only when v0 * v1 < 0. Lerping between same values might not produce the same value. float lerp(float v0, float v1, float t) { return (1 - t) * v0 + t * v1; } |
ret_val = self_val + weight_val * (end_val - self_val); |
ret_val = (weight_val < 0.5) ? self_val + weight_val * (end_val - self_val) : end_val - (end_val - self_val) * (1 - weight_val); |
a + t (b-a) | linear interpolation | |||
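The two formulations quoted above, in plain Python floats, evaluated at t = 1 where they differ:

```python
def lerp_imprecise(v0, v1, t):
    return v0 + t * (v1 - v0)       # may not return exactly v1 at t = 1

def lerp_precise(v0, v1, t):
    return (1 - t) * v0 + t * v1    # guarantees v1 at t = 1

v0, v1 = 1e16, 1.0
print(lerp_imprecise(v0, v1, 1.0))  # 0.0: v1 is rounded away inside (v1 - v0)
print(lerp_precise(v0, v1, 1.0))    # 1.0
```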
27 | PyTorch | e17b8dea1dd30bef55b314b0217f79ce22a13cf9 | C++ | Fix | overflow | In C and C++, integer literals are interpreted as an `int` type unless specified otherwise by using a trailing L for long and LL for long long, e.g., 42 is an int, 42L is a long, 42LL is a long long. On x86_64 systems using GNU toolchains on Linux, this is 32, 64, and 64 bits respectively. $ cat long.c #include <stdio.h> int main(void) { printf("sizeof(42) = %lu\n", sizeof(42)); printf("sizeof(42L) = %lu\n", sizeof(42L)); printf("sizeof(42LL) = %lu\n", sizeof(42LL)); return 0; } (py38) kyle@fulltower:~ $ gcc long.c (py38) kyle@fulltower:~ $ ./a.out sizeof(42) = 4 sizeof(42L) = 8 sizeof(42LL) = 8 |
Accumulator is overflowing because the starting value of the accumulation is too small a type (int) to accommodate the size of inputs that are common in PyTorch. The calculation of the number of elements (e.g. the number of batches) overflows because the return type does not have enough precision to hold the result. | linear algebra | linear algebra | linear algebra, distance | increase variable precision/change variable type | change variable type | Use a 64-bit type for the accumulator. |
- int64_t numel = std::accumulate(oldshape.begin(), oldshape.end(), 1, - std::multiplies<int64_t>()); |
const int64_t numel = prod_intlist(oldshape); | |||||
28 | PyTorch | 56840f0a81e4460089740d50d3768f37e79a17fc | Cuda | Fix | overflow | In binary search, the variables used to represent the indices will often be of fixed size (integers), and this can result in an arithmetic overflow for very large arrays. If the midpoint of the span is calculated as (L+R)/2, then the value of L+R may exceed the range of integers of the data type used to store the midpoint, even if L and R are within the range. If L and R are nonnegative, this can be avoided by calculating the midpoint as L+ ((R-L)/2) Bucketize bucketizes 'input' based on 'boundaries'. Summary For example, if the inputs are boundaries = [0, 10, 100] input = [[-5, 10000] [150, 10] [5, 100]] then the output will be output = [[0, 3] [3, 2] [1, 3]] |
Possible overflow when adding two 32-bit ints in the binary search algorithm when calculating the midpoint | other | bucketize | binary search, bucketize operation | rewrite math formula | rewrite math formula | Subtracting low from high first ensures that the intermediate calculation will not overflow its 32-bit datatype (see the sketch after this entry). | int32_t median = (high + low) / 2; | const int32_t median = low + (high - low) / 2; | int32_t mp1(int32_t a, int32_t b){ return (a+b)/2; } int32_t mp2(int32_t a, int32_t b){ return a+(b-a)/2; } int main(){ int32_t low=-1; for(int32_t high=1;high<10000;high++){ if(mp1(low,high)!=mp2(low,high)){ std::cout<<"Ahhhh!"<<std::endl; } } } |
||||
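The midpoint rewrite with NumPy int32 scalars, which wrap on overflow like the C int32_t arithmetic (the endpoints are arbitrary large indices):

```python
import numpy as np

low, high = np.int32(2_000_000_000), np.int32(2_100_000_000)

overflowing = (low + high) // np.int32(2)   # low + high exceeds 2**31 - 1 and wraps negative
safe = low + (high - low) // np.int32(2)    # (high - low) stays small: 2050000000

print(overflowing, safe)
```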
29 | PyTorch | 7f42d1c98a72855806bd35ef27ce6823837e0816 | C++ | Fix | loss of precision | Python "floats" are actually doubles internally | Originally a float was used, which has less precision than a double. JIT only supports double, not float. So when insertConstant, we need to cast the python `float_` to double instead of float. This will fix the incorrect `math.pi` and other high precision constants value. | other | other | python bindings from C++, low level math, constants | increase variable precision/change variable type | increase variable precision | When converting a Pyobject representation into a C++ representation, use a double instead of a float | return toSimple(g.insertConstant(py::cast<float>(obj), loc)); | return toSimple(g.insertConstant(py::cast<double>(obj), loc)); | |||||
30 | PyTorch | c784f847debc6f6a30b41da6853517b2ccd3ddf0 | C++ | Fix | overflow | int is 32 bits on amd64/Linux/GNU. sizes and indexes should use size_t in order to use the word size of the current platform, which allows one to index as many elements as could possibly fit into memory. | sparse_adagrad param_size overflow error | optimizers | optimizers | adagrad optimizer | increase variable precision/change variable type | increase variable precision | Correctly replace the data type of a size from "int" to "size_t" | int param_size uint64_t idx_pref = indices[i_pref]; |
size_t param_size auto idx_pref = indices[i_pref]; |
|||||
31 | PyTorch | 76c1b5cd794c44e4fec8da1d87ec8f0ccc045e68 | C++ | Fix | overflow | std::numeric_limits = a way to query various properties of arithmetic types | Reusing a variable whose data type (precision) depends on the template argument. Bug: caffe2/caffe2/operators/stats_put_ops.h:66:25: runtime error: 9.22337e+18 is outside the range of representable values of type 'long'. The assignment from int64_t to float loses some precision and because of that we overflow |
other | external library | Caffe operators | increase variable precision/change variable type | increase variable precision | Increase the precision of the computation to int64 and, instead of converting from float to int64_t at the end, add an overflow safeguard using std::numeric_limits |
- input = 0; - } else if (input < -bound_value) { - input = -bound_value; - } else if (input > bound_value) { - input = bound_value; - int64_t int_value = input * magnitude_expand_; |
int_value = 0; + } else if (input <= -bound_value) { + int_value = std::numeric_limits<int64_t>::min(); + } else if (input >= bound_value) { + int_value = std::numeric_limits<int64_t>::max(); + } else { + int_value = input * magnitude_expand_; } } else { CAFFE_ENFORCE( std::abs(static_cast<int64_t>(input)) < bound_value, "Input value is too large for the given magnitude expansion!"); CAFFE_ENFORCE(!isNan(input), "Input value cannot be NaN!"); + int_value = input * magnitude_expand_; } |
def test_clamp_with_out_of_bounds(self): + put_value = float(1e20) + magnitude_expand = 1000000000000 + stat_name = "stat".encode('ascii') + sum_postfix = "/stat_value/sum".encode("ascii") + count_postfix = "/stat_value/count".encode("ascii") + + workspace.FeedBlob("value", np.array([put_value], dtype=np.float)) + + workspace.RunOperatorOnce(core.CreateOperator( + "AveragePut", + "value", + [], + stat_name=stat_name, + magnitude_expand=magnitude_expand, + bound=True)) + + workspace.RunOperatorOnce(core.CreateOperator( + 'StatRegistryExport', [], ['k', 'v', 't'])) + + k = workspace.FetchBlob('k') + v = workspace.FetchBlob('v') + + stat_dict = dict(zip(k, v)) + + self.assertIn(stat_name + sum_postfix, stat_dict) + self.assertIn(stat_name + count_postfix, stat_dict) + self.assertEquals(stat_dict[stat_name + sum_postfix], + 9223372036854775807) self.assertEquals(stat_dict[stat_name + count_postfix], 1) |
||||
32 | PyTorch | 08b1324ec26043b1acfaf4b65335c671c8658a3c | C | Fix | overflow | integer overflow in remainder operator | tensor math | tensor math | tensor math, remainder operator | rewrite math formula | rewrite math formula, add overflow check | The sign of the result of modulo should be the same as the denominator. This commit checks that those signs have not flipped, which would indicate an overflow. There is a bug, however, because signed overflow is undefined behavior in C, and therefore the compiler is allowed to emit any machine code for this. A compiler upgrade may break this code, or more likely, not emit machine code for this condition check, since a signed integer cannot overflow as per the standard, and thus the compiler can ignore that as a condition. | TensorRemainderOp(T v) : val(v) {} __device__ __forceinline__ void operator()(T* out, T* in) { *out = *in % val; - if ((*out * val) < 0){ *out += val; } |
static inline bool has_different_sign(real a, real b) { + return (a < 0) != (b < 0); +} TensorRemainderOp(T v) : val(v) {} __device__ __forceinline__ void operator()(T* out, T* in) { *out = *in % val; + if (has_different_sign<T>(*out, val)){ *out += val; } |
def _test_remainder_overflow(self, dtype=torch.int64): + # Check Integer Overflows + x = torch.tensor(23500, dtype=dtype) + q = 392486996410368 + self.assertEqual(x % q, x) + self.assertEqual(-x % q, q - x) + self.assertEqual(x % -q, x - q) + self.assertEqual(-x % -q, -x) + + def test_remainder_overflow(self): + self._test_remainder_overflow(self, dtype=torch.int64) For CUDA: + def test_remainder_overflow(self): + TestTorch._test_remainder_overflow(self, dtype=torch.cuda.int64) |
|||||
33 | PyTorch | 6185b27cc6645d8055b76f9cc330b010d1c2a258 | C++ | Fix | loss of precision | Standard_gamma_grad computes the reparameterized gradient -(d/dalpha cdf(x;alpha)) / pdf(x;alpha) for random number x drawn from a standard Gamma distribution Gamma(alpha) standard_gamma_grad_one(scalar alpha, scalar x) |
low precision of gamma distribution gradient | statistical distributions | statistical distributions | gradients, gamma distribution | use a different algorithm | use a different algorithm | Use a Taylor series expansion and a Rice saddle point expansion instead of an asymptotic approximation for calculating the gamma distribution gradient. In particular, use a Taylor series expansion for small x and a Rice saddle point expansion for large alpha. |
34 | PyTorch | c43b120d4329dbcbed114eae8b4cfb23f11b3779 | C | Fix | loss of precision | The linspace operation creates a one-dimensional tensor of size steps whose values are evenly spaced from start to end, inclusive. | low float precision in the linspace operation | tensor math | tensor math | linspace (i.e.: 1D tensor creation) | rewrite math formula | rewrite math formula | Reverse the order of multiplication and division: divide (b-a) by (n-1) first, then multiply by i. Note that this formula can yield slightly different results due to rounding. | *r__data = a + i*(b-a)/((real)(n-1)); | *r__data = a + (b-a)/((real)(n-1))*i; | start, start + (end-start)/(steps-1), ..., start + (steps - 2) * (end-start)/(steps-1) | multiply, divide, add, subtract |
35 | PyTorch | 415658836538d69362ed5482dc5fbfdba39a1c69 | C++ | Unit test | hardware | result of log is slightly different on different hardware platforms | Inexact | Logarithms are easy to compute in some cases, such as log10(1000) = 3. In general, logarithms can be calculated using power series or the arithmetic–geometric mean, or be retrieved from a precalculated logarithm table that provides a fixed precision. | Log approximation is not bitwise identical on different hardware platforms. Different processors (in this case Broadwell vs Skylake) can behave differently when it comes to floating point operations. Log is implemented in software, so different hardware platforms using the same software stack may use the same algorithm to calculate log, but the primitive operations used for floating point arithmetic can differ between FPU implementations. | tensor math | tensor math | testing output accuracy, log approximation, hardware, tensor math | relax accuracy test tolerance | relax accuracy test tolerance | Rather than asserting a bit-by-bit perfect match, compare with a tolerance of the 32-bit floating point epsilon. Epsilon is the smallest number such that, when added to the floating point number 1.0, it yields a value greater than 1.0. Allow 1 ULP (unit in the last place) of error by allowing an epsilon relative tolerance. Epsilon is defined using the C++ standard library's numeric_limits for float, which returns the machine epsilon, that is, the difference between 1.0 and the next value representable by the floating-point type T (see the sketch after this entry). |
// Results should be bit-identical. ASSERT_TRUE( memcmp( B_ref.data_ptr<float>(), B_t.data_ptr<float>(), B_ref.nbytes()) == 0); |
// Results should be bit-identical. ASSERT_TRUE(torch::allclose( B_t, B_ref, /*rtol=*/eps, /*atol=*/0.0f, /*equal_nan=*/true)) << "Input[:8]\n" << A_t.index({Slice(0, 8)}) << "\n" << "Test[:8]\n" << B_t.index({Slice(0, 8)}) << "\n" << "Ref[:8]\n" << B_ref.index({Slice(0, 8)}) << diffs(B_t, B_ref); |
N/A | log | log approximation |
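A small sketch of the relaxed comparison in PyTorch terms: accept a one-ULP-scale relative difference instead of requiring bit-identical outputs:

```python
import torch

eps = torch.finfo(torch.float32).eps   # ~1.19e-07, same as std::numeric_limits<float>::epsilon()

a = torch.tensor([0.69314718], dtype=torch.float32)
b = a * (1 + eps)                      # perturb by about one unit in the last place

print(torch.equal(a, b))                          # False: bitwise comparison rejects it
print(torch.allclose(a, b, rtol=eps, atol=0.0))   # True: within the epsilon relative tolerance
```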
36 | PyTorch | 2e35fe953553247d8a22fc38b039374e426f13b8 | C++ | Speed optimization | inefficient algorithm | low speed of model training | N/A | VML here refers to a vector math library (as in ATen's vml.h and Intel MKL's Vector Mathematical functions): a collection of vectorized implementations of elementary functions such as exp and log, not "variational maximum likelihood" |
The FPU has only one divider, so FP division operations are slow. | tensor math | tensor math | log approximation, tensor math | use a different algorithm | use a different algorithm | Implement a vectorized (VML-style) log approximation, which is faster. It increases speed by keeping the floating point units busy and avoiding division operations, allowing better instruction-level parallelism. Use a power series for the vml log instead of sleef. | N/A | // Generate every single-precision FP value in [1.0, 2.0). + auto eps = std::numeric_limits<float>::epsilon(); + at::Tensor A_t = torch::arange(1.0f, 2.0f, eps); + ASSERT_EQ(A_t.numel(), 1 << 23); + + test(A_t); + + test(A_t * 2.0f); + test(A_t * 0.5f); + + test(A_t * 4.0f); + test(A_t * 0.25f); + + test(A_t * powf(2.0f, 16)); + test(A_t * powf(2.0f, -16)); + + test(A_t * powf(2.0f, 126)); + test(A_t * powf(2.0f, -126)); + + test(torch::full({32}, INFINITY)); + test(torch::full({32}, NAN)); + + auto min = std::numeric_limits<float>::min(); + auto denorm_min = std::numeric_limits<float>::denorm_min(); + + // Denormals aren't bit precise, because sleef isn't bit-precise either. + A_t = torch::arange(0.0f, min, denorm_min); + ASSERT_EQ(A_t.numel(), 1 << 23); + auto B_ref = at::log(A_t); + auto B_t = at::empty_like(B_ref); + cg.call({A_t.data_ptr<float>(), B_t.data_ptr<float>(), A_t.numel()}); + ASSERT_TRUE(torch::allclose(B_t, B_ref)); +} |
log | log | |
37 | PyTorch | 1047957831e2ef68d60af90865187e46ba6e5e86 | C++ | Speed optimization | inefficient algorithm | low speed of model training | N/A | SLEEF stands for SIMD Library for Evaluating Elementary Functions. It implements manually vectorized versions of all C99 real floating point math functions. It can utilize SIMD instructions that are available on modern processors. SLEEF is designed to efficiently perform computation with SIMD instructions by reducing the use of conditional branches and scatter/gather memory access. | Log can be slow to compute, an optimized algorithm can help. | tensor math | tensor math | log approximation, tensor math | use a different algorithm | use a different algorithm | add log approximation based on SLEEF. | N/A | log | |||
38 | PyTorch | 2572d7a67123fdccef8979520be335c95605cf82 | Python | Unit test | loss of precision | Inaccurate result | Inexact | PyTorch provides two different modes of quantization: Eager Mode Quantization and FX Graph Mode Quantization. Eager Mode Quantization is a beta feature. User needs to do fusion and specify where quantization and dequantization happens manually, also it only supports modules and not functionals. https://pytorch.org/docs/stable/quantization.html quantization aware training (weights quantized, activations quantized, quantization numerics modeled during training) | Needed a unit test for leaky relu in quantization aware training | quantization | quantization | testing, quantization, leaky relu, eger mode quantization, qat conversion, quantization aware training | add test/warning | add precision test | Add numerical test for conversion in qat (Quantization-aware training) for leaky relu | def _test_activation_impl( self, float_module, float_op, quantized_module, quantized_op): ''' Test for activation op(with inplace options), float_op can be torch op or functional op ''' class M(torch.nn.Module): def __init__(self, is_module, inplace): super(M, self).__init__() self.is_module = is_module self.inplace = inplace if self.is_module: self.op = float_module(self.inplace) else: self.op = float_op def forward(self, input): if self.is_module: return self.op(input) else: return self.op(input, self.inplace) options = itertools.product([True, False], [True, False], self.static_quant_types) quantized_nodes = { # is_module True: ns.call_module(quantized_module), False: ns.call_function(quantized_op), } for is_module, is_inplace, quant_type in options: self.checkGraphModeFxOp( M(is_module, is_inplace), self.img_data_2d, quant_type, quantized_nodes[is_module]) |
class TestEagerModeQATOps(QuantizationTestCase): + def _test_activation_convert_numerics_impl(self, Act, data): class M(torch.nn.Module): def __init__(self): super().__init__() @@ -1321,6 +1321,10 @@ class TestEagerModeQATOps(QuantizationTestCase): m = convert(m) checkNoFQModule(m) class TestQATActivationOps(QuantizationTestCase): def _test_activation_convert_numerics_impl(self, Act, data): class M(torch.nn.Module): def __init__(self): super().__init__() self.act = Act() self.quant = QuantStub() self.dequant = DeQuantStub() def forward(self, x): x = self.quant(x) x = self.act(x) x = self.dequant(x) return x m = M().train() m.qconfig = default_qat_qconfig m = prepare_qat(m) before_convert = m(data) m = convert(m) after_convert = m(data) self.assertEqual(before_convert, after_convert) + def test_leaky_relu(self): + data = torch.randn(1, 3, 2, 4) + self._test_activation_convert_numerics_impl(nn.LeakyReLU, data) |
LeakyReLU(x)=max(0,x)+negative_slope∗min(0,x) | leaky relu | |
39 | PyTorch | c9a8413306312b2f2789dd46d5ac1a947be6b556 | Cuda | Fix | loss of precision | NaN, Inf gradients | Creating and using character or word embeddings is the mainstream approach for handling most NLP tasks. Each character/word is matched with a numeric vector to create a numerical vector representation of text, which can be input into a model. | Intermediate calculations were done in the same type as the output; in the case of float16 this can lead to loss of precision. During FP16 training, char_embeddings.weight gets NaN or Inf gradients | other | NLP | backward pass, character embedding, NLP | increase variable precision/change variable type | increase variable precision | Use higher precision for the variable that holds the intermediate result: use a `float32` temporary tensor when the input is `float16` |
40 | PyTorch | 699de487db9f2cb6de5cba9588311eed46a8ccb3 | C++ | New feature | N/A | The trapezoidal rule for integration is an approximation technique for calculating the area under a curve by summing trapezoids under the curve. The estimated integral of a function y of x, sampled at points (y_1, ..., y_n) that are separated by distances (dx_1, ..., dx_{n-1}), is given by the trapezoid rule: sum_{i=1}^{n-1} dx_i * (y_i + y_{i+1}) / 2 (see the Python sketch after this entry) |
N/A | other | integration | integration | other | add new algorithm | add numerical integration based on trapeizoidal rule that matches numpy implementation | N/A | Tensor do_trapz(const Tensor& y, const Tensor& dx, int64_t dim) { + Tensor left = y.slice(dim, 0, -1); + Tensor right = y.slice(dim, 1); + + return ((left + right) * dx).sum(dim) / 2.; +} + +// When dx is constant, the above formula simplifies +// to dx * [(\sum_{i=1}^n y_i) - (y_1 + y_n)/2] +Tensor do_trapz(const Tensor& y, double dx, int64_t dim) { + return (y.sum(dim) - (y.select(dim, 0) + y.select(dim, -1)) * (0.5)) * dx; +} + +Tensor zeros_like_except(const Tensor& y, int64_t dim) { + auto sizes = y.sizes().vec(); + dim = maybe_wrap_dim(dim, y.dim()); + sizes.erase(sizes.begin() + dim); + return at::zeros(sizes, y.options()); +} + +} + +Tensor trapz(const Tensor& y, const Tensor& x, int64_t dim) { + dim = maybe_wrap_dim(dim, y); + // asking for the integral with zero samples is a bit nonsensical, + // but we'll return "0" to match numpy behavior. + if (y.size(dim) == 0) { + return zeros_like_except(y, dim); + } + Tensor x_viewed; + if (x.dim() == 1) { + TORCH_CHECK(x.size(0) == y.size(dim), "trapz: There must be one `x` value for each sample point"); + DimVector sizes(y.dim(), 1); + sizes[dim] = x.size(0); + x_viewed = x.view(sizes); + } else { + x_viewed = x; + } + Tensor x_left = x_viewed.slice(dim, 0, -1); + Tensor x_right = x_viewed.slice(dim, 1); + + Tensor dx = x_right - x_left; + return do_trapz(y, dx, dim); +} + +Tensor trapz(const Tensor& y, double dx, int64_t dim) { + // see above + if (y.size(dim) == 0) { + return zeros_like_except(y, dim); + } + return do_trapz(y, dx, dim); |
def test_trapz(self): + f_args_variable = (torch.randn(2, 3, requires_grad=True), + torch.tensor([[1.0, 2.0, 5.5], [2.3, 0.5, 6.2]], requires_grad=True)) + f_args_tensor = deepcopy(unpack_variables(f_args_variable)) + run_functional_checks(self, "test_trapz", "trapz", + lambda y, x: torch.trapz(y, x), + True, f_args_variable, f_args_tensor) @unittest.skipIf(not TEST_NUMPY, "Numpy not found") + def test_trapz(self): + def test_dx(sizes, dim, dx, device): + t = torch.randn(sizes, device=device) + actual = torch.trapz(t, dx=dx, dim=dim) + expected = np.trapz(t.cpu().numpy(), dx=dx, axis=dim) + self.assertEqual(expected.shape, actual.shape) + self.assertTrue(np.allclose(expected, actual.cpu().numpy())) + + def test_x(sizes, dim, x, device): + t = torch.randn(sizes, device=device) + actual = torch.trapz(t, x=torch.tensor(x, device=device), dim=dim) + expected = np.trapz(t.cpu().numpy(), x=x, axis=dim) + self.assertEqual(expected.shape, actual.shape) + self.assertTrue(np.allclose(expected, actual.cpu().numpy())) + + for device in torch.testing.get_all_device_types(): + test_dx((2, 3, 4), 1, 1, device) + test_dx((10, 2), 0, 0.1, device) + test_dx((1, 10), 0, 2.3, device) + test_dx((0, 2), 0, 1.0, device) + test_dx((0, 2), 1, 1.0, device) + test_x((2, 3, 4), 1, [1.0, 2.0, 3.0], device) + test_x((10, 2), 0, [2.0, 3.0, 4.0, 7.0, 11.0, 14.0, 22.0, 26.0, 26.1, 30.3], device) + test_x((1, 10), 0, [1.0], device) + test_x((0, 2), 0, [], device) + test_x((0, 2), 1, [1.0, 2.0], device) + with self.assertRaisesRegex( + IndexError, + 'Dimension out of range'): + test_x((2, 3), 2, [], device) + test_dx((2, 3), 2, 1.0, device) + with self.assertRaisesRegex( + RuntimeError, + 'There must be one `x` value for each sample point'): + test_x((2, 3), 1, [1.0, 2.0], device) + test_x((2, 3), 1, [1.0, 2.0, 3.0, 4.0], device) |
y = 1/(1+exp(-x)), x = logit(y) | integration | ||
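To make the trapezoid rule in row 40 concrete, here is a minimal Python sketch (not the PyTorch C++ implementation quoted above) that evaluates sum_{i=1}^{n-1} dx_i * (y_i + y_{i+1}) / 2 directly and checks it against torch.trapz; the sample values are illustrative.

```python
import torch

# Illustrative sample points y_i at locations x_i.
x = torch.tensor([0.0, 0.5, 1.5, 3.0])
y = torch.tensor([1.0, 2.0, 0.5, 1.5])

# Trapezoid rule: sum_i dx_i * (y_i + y_{i+1}) / 2
dx = x[1:] - x[:-1]
manual = ((y[:-1] + y[1:]) * dx).sum() / 2

# torch.trapz applies the same formula (newer releases also expose torch.trapezoid).
print(manual.item(), torch.trapz(y, x).item())
```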
41 | PyTorch | c5d5d45f40969cbddbb7f87da343dfd422503c1c | Python | Fix | overflow/underflow | overflow/underflow | The absolute value of the Jacobian determinant at p gives us the factor by which the function f expands or shrinks volumes near p; this is why it occurs in the general substitution rule. The Jacobian determinant is used when making a change of variables when evaluating a multiple integral of a function over a region within its domain. According to the inverse function theorem, the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function. |
The log absolute determinant of the Jacobian of the inverse transformation in the sigmoid transform is unstable and returns NaN | statistical distributions | statistical distributions | Log absolute determinant Jacobian, distribution transformation | rewrite math formula | rewrite math formula | Rewrite the log_abs_det_jacobian method | -(y.reciprocal() + (1 - y).reciprocal()).log() | -F.softplus(-x) - F.softplus(x) | reciprocal | |||
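For the sigmoid transform in row 41, the rewrite follows from y = sigmoid(x) having dy/dx = y(1 - y), so log|dy/dx| = log y + log(1 - y) = -softplus(-x) - softplus(x). A hedged sketch (illustrative values only) of why the softplus form survives saturation while the reciprocal form does not:

```python
import torch
import torch.nn.functional as F

x = torch.tensor([0.0, 20.0, -120.0])
y = torch.sigmoid(x)          # saturates to exactly 1.0 (x=20) and 0.0 (x=-120) in float32

old = -(y.reciprocal() + (1 - y).reciprocal()).log()   # -inf for the saturated entries
new = -F.softplus(-x) - F.softplus(x)                  # finite: about [-1.3863, -20.0, -120.0]

print(old)
print(new)
```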
42 | PyTorch | 645ad7ad0c89ecef61e89666745324deba31c8b7 | Python | Fix | underflow | NaN | LP is LP space: At p = 1, one gets Sum Pooling (which is proportional to average pooling), p = inf is max pooling | The gradient in LP pooling 1D and 2D becomes NaN when all inputs are zero. If all inputs are zero, the sum of all x to the power of p is zero, and the gradient of the p-th root at zero involves a division by zero, yielding NaN | CNN operations | pooling layer | LP pooling | rewrite math formula | rewrite math formula | Add a relu unit to LP pooling to avoid gradient = NaN. After this patch the gradient is set to zero instead of NaN. | return out.mul(kw * kh).pow(1. / norm_type) | return (torch.sign(out) * relu(torch.abs(out))).mul(kw * kh).pow(1. / norm_type) | pth root of sum of x^p | pth root of sum of polynomials | ||
43 | PyTorch | de42542351ad933ada59a4a8cf3b247d75d52917 | Python | Fix | loss of precision | Precision matrix (also known as concentration matrix) is the matrix inverse of the covariance matrix. The multivariate normal distribution can be parametrized either by the covariance matrix or precision matrix. | precision matrix computation in multivariate normal distribution is unstable due to matrix inverse | statistical distributions | statistical distributions | distributions, precision matrix, multivariate normal distribution | rewrite math formula | rewrite math formula | The prior computation of the precision matrix used the inverse of the covariance matrix. Compute the precision matrix from scale_tril instead; scale_tril is a lower-triangular k x k matrix with a non-zero diagonal. | - flat_conv = self.covariance_matrix.reshape((-1,) + self._event_shape * 2) - flat_precision = torch.stack([C.inverse() for C in flat_conv], 0) |
scale_tril_inv = _batch_inverse(self.scale_tril) + flat_scale_tril_inv = self.scale_tril.reshape((-1,) + self._event_shape * 2) + flat_precision = torch.bmm(flat_scale_tril_inv.transpose(-1, -2), + flat_scale_tril_inv) |
matrix inverse | ||||
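For row 43, if the covariance is parametrized as Sigma = L L^T with L = scale_tril, then the precision matrix is Sigma^{-1} = L^{-T} L^{-1}, which avoids inverting Sigma itself. A small sketch under that assumption (the actual fix uses a batched inverse of scale_tril, as shown in the patch):

```python
import torch

# Illustrative 3x3 lower-triangular factor with a positive diagonal (float64 for clarity).
torch.manual_seed(0)
L = torch.randn(3, 3, dtype=torch.float64).tril()
L.diagonal().abs_().add_(1.0)

cov = L @ L.T                      # covariance parametrized via scale_tril

prec_direct = torch.inverse(cov)   # old route: invert the covariance directly
L_inv = torch.inverse(L)           # the patch uses a batched inverse of scale_tril
prec_tril = L_inv.T @ L_inv        # precision = L^{-T} L^{-1}

print(torch.allclose(prec_direct, prec_tril))
```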
44 | PyTorch | 8cff8e93d21142ff42b9d2b1f45b01acde0b9d99 | Python | Fix | loss of precision | NaN | PyTorch no longer has `from torch.distributions.utils import _finfo`; there is now torch.finfo. A torch.finfo is an object that represents the numerical properties of a floating point torch.dtype (i.e. torch.float32, torch.float64, and torch.float16). This is similar to numpy.finfo. |
Need a function for checking numerical properties of variables and calculating epsilon, which is used, for example, in softmax. Different floating point types have different characteristics with regard to their precision: what is the smallest positive number they can represent, what is the smallest number that can be added to one without truncation, etc. | statistical distributions | statistical distributions | distributions (Laplace, Gumbel, Gamma, Dirichlet) | use a different algorithm | use a different algorithm | PyTorch has many different datatypes with varying degrees of precision. _finfo allows one to get information about characteristics such as the smallest number that can be added to 1 without truncation (eps), and the smallest positive number greater than zero (tiny) for each type of float. The newly implemented _finfo is used to clamp the Gamma, Beta, and Dirichlet distributions to avoid NaNs. | def _get_clamping_buffer(tensor): - clamp_eps = 1e-6 - if isinstance(tensor, Variable): - tensor = tensor.data - if isinstance(tensor, (torch.DoubleTensor, torch.cuda.DoubleTensor)): - clamp_eps = 1e-15 - return clamp_eps eps = _get_clamping_buffer(probs) |
# This follows semantics of numpy.finfo. +_Finfo = namedtuple('_Finfo', ['eps', 'tiny']) +_FINFO = { + torch.HalfStorage: _Finfo(eps=0.00097656, tiny=6.1035e-05), + torch.FloatStorage: _Finfo(eps=1.19209e-07, tiny=1.17549e-38), + torch.DoubleStorage: _Finfo(eps=2.22044604925e-16, tiny=2.22507385851e-308), + torch.cuda.HalfStorage: _Finfo(eps=0.00097656, tiny=6.1035e-05), + torch.cuda.FloatStorage: _Finfo(eps=1.19209e-07, tiny=1.17549e-38), + torch.cuda.DoubleStorage: _Finfo(eps=2.22044604925e-16, tiny=2.22507385851e-308), +} _finfo doc comment: def _finfo(tensor): """ Return floating point info about a `Tensor` or `Variable`: - `.eps` is the smallest number that can be added to 1 without being lost. - `.tiny` is the smallest positive number greater than zero (much smaller than `.eps`). Args: tensor (Tensor or Variable): tensor or variable of floating point data. Returns: _Finfo: a `namedtuple` with fields `.eps` and `.tiny`. """ eps = _finfo(probs).eps |
||||
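For row 44, the modern equivalent of the _finfo helper is torch.finfo; a short sketch of reading eps/tiny per dtype and using them for the kind of clamping described above (the clamp itself is only illustrative, not the distributions code):

```python
import torch

for dtype in (torch.float16, torch.float32, torch.float64):
    info = torch.finfo(dtype)
    # eps: smallest e with 1 + e != 1; tiny: smallest positive normal number
    print(dtype, info.eps, info.tiny)

# Illustrative clamp in the spirit of the Gamma/Beta/Dirichlet fix.
probs = torch.tensor([0.0, 0.5, 1.0])
eps = torch.finfo(probs.dtype).eps
print(probs.clamp(min=eps, max=1 - eps))
```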
45 | PyTorch | bc505100167f61ce241f511741794dfe2f89c5f0 | Python | Fix | loss of precision | Logit is the natural logarithm of odds, which is defined as p / (1-p), where p is probability. Probabilities range from zero to one, i.e., p∈[0,1], whereas logits can be any real number (R, from minus infinity to infinity) | numerical stability of linspace implementation | loss functions | loss functions | loss, caffe2, batch lr loss | use a different algorithm | use a different algorithm | Delete code that uses probability, use only logits in batch lr loss | - if schema.is_schema_subset( - schema.Struct( - ('label', schema.Scalar()), - ('logit', schema.Scalar()) - ), self.input_record - ): - label = self.input_record.label() - # mandatory cast to float32 - # self.input_record.label.field_type().base is np.float32 but - # label type is actually int - label = net.Cast( - label, - net.NextScopedBlob('label_float32'), - to=core.DataType.FLOAT) - label = net.ExpandDims(label, net.NextScopedBlob('expanded_label'), - dims=[1]) - xent = net.SigmoidCrossEntropyWithLogits( - [self.input_record.logit(), label], - net.NextScopedBlob('cross_entropy'), - ) - # TODO(T23937449): Change all the use cases of BatchLRLoss to the - # numerically stable version - else: - class_probabilities = net.MakeTwoClass( - self.input_record.prediction.field_blobs(), - net.NextScopedBlob('two_class_predictions') - ) - label = self.input_record.label.field_blobs() - label = [net.Cast( - label, - net.NextScopedBlob('int32_label'), - to=core.DataType.INT32)] - xent = net.LabelCrossEntropy( - [class_probabilities] + label, - net.NextScopedBlob('cross_entropy'), - ) |
label = self.input_record.label() + # mandatory cast to float32 + # self.input_record.label.field_type().base is np.float32 but + # label type is actually int + label = net.Cast( + label, + net.NextScopedBlob('label_float32'), + to=core.DataType.FLOAT) + label = net.ExpandDims(label, net.NextScopedBlob('expanded_label'), + dims=[1]) + xent = net.SigmoidCrossEntropyWithLogits( + [self.input_record.logit(), label], + net.NextScopedBlob('cross_entropy'), + ) |
logit = ln(p/(1-p)) | ln | |||
46 | PyTorch | 40b783b746b4f5775c97c7fe41dfb011b545665a | Python | Unit test | loss of precision | A simple approximation of the first derivative is f'(x) ~ (f(x+h)-f(x))/h, where h is the step size. | Unit test failing because the numerical approximation of the derivative (i.e., the gradient) of pReLU uses a step size that is too large, which causes a large approximation error. | activation functions | activation functions | testing accuracy, gradients, caffe2, pReLU | rewrite math formula | rewrite math formula | Improve the gradient-check asserts in the test by using a smaller step size | self.assertGradientChecks(gc, op, [X, W], 0, [0]) | self.assertGradientChecks(gc, op, [X, W], 0, [0], stepsize=1e-2) | def test_prelu(self, X, alpha, inplace, shared, order, seed, gc, dc): np.random.seed(seed) W = np.random.randn( X.shape[1] if order == "NCHW" else X.shape[3]).astype(np.float32) if shared: W = np.random.randn(1).astype(np.float32) # go away from the origin point to avoid kink problems X += 0.04 * np.sign(X) X[X == 0.0] += 0.04 def prelu_ref(X, W): Y = X.copy() W = W.reshape(1, -1, 1, 1) if order == "NCHW" \ else W.reshape(1, 1, 1, -1) assert len(X.shape) == 4 neg_indices = X <= 0 assert len(neg_indices.shape) == 4 assert X.shape == neg_indices.shape Y[neg_indices] = (Y * W)[neg_indices] return (Y,) op = core.CreateOperator( "PRelu", ["X", "W"], ["Y" if not inplace else "X"], alpha=alpha, order=order) self.assertReferenceChecks(gc, op, [X, W], prelu_ref, ensure_outputs_are_inferred=True) # Check over multiple devices self.assertDeviceChecks(dc, op, [X, W], [0]) if not inplace: # Gradient check wrt X self.assertGradientChecks(gc, op, [X, W], 0, [0], stepsize=1e-2, ensure_outputs_are_inferred=True) # Gradient check wrt W self.assertGradientChecks(gc, op, [X, W], 1, [0], stepsize=1e-2, ensure_outputs_are_inferred=True) |
PReLU(x)=max(0,x)+a∗min(0,x) | pReLU | ||
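For row 46, the effect of the finite-difference step size on the gradient-check error can be seen on any smooth function; the sketch below is illustrative only (it uses sin, not the caffe2 PReLU operator).

```python
import numpy as np

# Forward-difference approximation of d/dx sin(x) at x = 1.0 for several step sizes.
# The truncation error shrinks roughly linearly with h, which is why the test
# tightened stepsize to 1e-2.
x = 1.0
exact = np.cos(x)
for h in (5e-2, 1e-2, 1e-3):
    approx = (np.sin(x + h) - np.sin(x)) / h
    print(f"h={h:g}  abs error={abs(approx - exact):.2e}")
```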
47 | PyTorch | e187ba7a9fb18aba0a0651e05c20e1f491d989fc | Python | Fix | loss of precision | inaccurate result | Inexact | Fmod computes the element-wise remainder of division. When the divisor is zero, returns NaN for floating point dtypes on both CPU and GPU; raises RuntimeError for integer division by zero on CPU; Integer division by zero on GPU may return any value. | Unit tests for Fmod/Remainder fail due to the numerical jacobian check. Previously, tests for Fmod and Remainder added 5e-2 to the denominator tensor (the same as the div tests), which only avoids divide by 0, but not issues with computing the numerical jacobian due to the non-linearity of fmod/remainder when input / divisor is close to an integer. | gradients/derivatives | automatic differentiation | testing accuracy, automatic differentiation, remainder, numerical jacobian | rewrite math formula | rewrite math formula | Ensure that the result of input / divisor is not close to an integer. Add 1.5 to the denominator (instead of 5e-2) to make it more likely that the result will not be an integer. Note: this code is no longer in PyTorch; the intent was specifically to decrease the probability of numerical issues in the numerical jacobian computation. | remainder | ||||
48 | PyTorch | 67968cb60b1d3021834594967d4140a36a8213e3 | Python | Fix | overflow/loss of precision | Binary cross entropy with logits measures the probability error in tasks with two outcomes in which each outcome is independent and need not have a fully certain label. For instance, one could perform a regression where the probability of an event happening is known and used as a label. This loss may also be used for binary classification, where labels are either zero or one. | Using sigmoid followed by a BCE loss layer can be less stable than a single layer that combines sigmoid with BCE Loss | loss functions | loss functions | binary cross entropy loss | use a different algorithm | use a different algorithm | Combine sigmoid and BCE loss into one layer and use the log-sum-exp trick. This is more stable than using a plain sigmoid followed by a BCE loss | def binary_cross_entropy_with_logits(input, target, weight=None, size_average=True): + r"""Function that measures Binary Cross Entropy between target and output logits: + + See :class:`~torch.nn.BCEWithLogitsLoss` for details. + + Args: + input: Variable of arbitrary shape + target: Variable of the same shape as input + weight (Variable, optional): a manual rescaling weight + if provided it's repeated to match input tensor shape + size_average (bool, optional): By default, the losses are averaged + over observations for each minibatch. However, if the field + sizeAverage is set to False, the losses are instead summed + for each minibatch. + """ + if weight is not None and target.dim() != 1: + weight = weight.view(1, target.size(1)).expand_as(target) + neg_abs = - input.abs() + loss = input.clamp(min=0) - input * target + (1 + neg_abs.exp()).log() + + if weight is not None: + loss = loss * weight + + if size_average: + return loss.mean() + else: + return loss.sum() |
loss(o, t) = - 1/n \sum_i (t[i] * log(sigmoid(o[i])) + (1 - t[i]) * log(1 - sigmoid(o[i]))) | log, sigmoid | ||||
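For row 48, the combined layer rests on the identity -[t log sigmoid(x) + (1 - t) log(1 - sigmoid(x))] = max(x, 0) - x*t + log(1 + exp(-|x|)); the sketch below compares a naive sigmoid-then-log version with that rewrite on illustrative logits.

```python
import torch
import torch.nn.functional as F

def naive_bce(logit, target):
    # sigmoid followed by log: saturates for large |logit| in float32
    p = torch.sigmoid(logit)
    return -(target * p.log() + (1 - target) * (1 - p).log())

def stable_bce_with_logits(logit, target):
    # max(x, 0) - x*t + log(1 + exp(-|x|)), as in the patched function above
    return logit.clamp(min=0) - logit * target + (1 + (-logit.abs()).exp()).log()

x = torch.tensor([0.0, 40.0, -200.0])
t = torch.tensor([1.0, 0.0, 1.0])
print(naive_bce(x, t))                  # inf where sigmoid saturates to 0 or 1
print(stable_bce_with_logits(x, t))     # roughly [0.693, 40.0, 200.0]
print(F.binary_cross_entropy_with_logits(x, t, reduction='none'))
```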
49 | PyTorch | 7ba5e7cea1d2be485d2806ad38608dad9bcc7041 | Python | Fix | loss of precision | Pooling layers are used to reduce the dimensions of the feature maps and to summarize them. A max pooling layer returns the maximum values of rectangular regions of its input. Boundary conditions (b.c.) are constraints necessary for the solution of a boundary value problem. A boundary value problem is a differential equation (or system of differential equations) to be solved in a domain on whose boundary a set of conditions is known. |
The VolumetricMaxPooling (in legacy.nn) precision test kept failing: one set of indices in the same pooling window differed by less than epsilon, so the numeric gradient was hitting boundary conditions (max pooling is, of course, discontinuous) |
CNN operations | pooling layer | testing, max pooling | rewrite math formula | rewrite math formula | modify the test so the input tensor does not hit these boundary conditions, using torch.randn (which returns a tensor filled with random numbers from a standard normal distribution) and then multiplying by 1000 | input_size=(2, 3, 5, 5, 5)) | input=(torch.randn(2, 3, 5, 5, 5) * 1000)), | |||||
50 | PyTorch | a03692069ebe19038bfccf5a59208ed2989bd4d9 | Python | Unit test | loss of precision | Unit test sometimes failing because of numerical gradient approximation error | loss functions | loss functions | caffe2, loss | relax accuracy test tolerance | relax accuracy test tolerance | Increase the tolerance when comparing the gradient to make the test pass | delta=1e-3 | delta=1e-2 * abs(np.asscalar(dx[0]))) | ||||||
51 | PyTorch | 33cc71dc55db073ba46b065e24cff0d26156376f | C | Fix | loss of precision | Returns a 1-D tensor of size (end - start)/step + 1 with values from start to end in increments of step. Step is the gap between two values in the tensor. | Precision can be lost when floats get very small, leading to unexpected behavior | tensor math | tensor math | range (i.e.: 1 D tensor) | rewrite math formula | rewrite math formula | When dividing (xmax - xmin) by step, the numerator can become very small if xmax and xmin are close to each other. It is mathematically equivalent and also more stable in that case to distribute the division and then subtract, i.e. (xmax / step) - (xmin / step). Note: This function is deprecated and will be removed in a future release because its behavior is inconsistent with Python’s range builtin. Instead, use torch.arange(), which produces values in [start, end). | void THTensor_(range)(THTensor *r_, real xmin, real xmax, real step) size = (long)((xmax-xmin)/step+1); |
void THTensor_(range)(THTensor *r_, accreal xmin, accreal xmax, accreal step) size = (long)((xmax/step - xmin/step)+1); |
|||||
52 | PyTorch | 87fcf3072ef988b5b2e408cce141b76235929bbd | C++ | Fix | overflow | Hsum_sq performs horizontal sum of squares over a range of uint8_t, returns row sum |
The quantized version of hsum_sq overflows when the input image size is large, such as (H, W, D) = (224, 224, 160) | quantization | quantization | quantization, sum of squares | use a different algorithm | use a different algorithm | Rewrite the for-loop definition to include an overflow threshold to prevent overflow | for (; i < len / 16 * 16; i += 16) { } |
int overflow_threshold = 262144; // 2147483647(max of int32)/(256*256)*8 = 262144 int loop = len / overflow_threshold + 1; for(int j=0; j<=loop; j++){ for (; ((i < overflow_threshold * j) && (i < len / 16 * 16)); i += 16) { |
|||||
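For row 52, the overflow threshold in the patch (2147483647 / (256*256) * 8) exists because squared uint8 values summed into a 32-bit accumulator exceed INT32_MAX for large images; a numpy sketch of the same effect with an all-255 image of the reported (224, 224, 160) size:

```python
import numpy as np

img = np.full((224, 224, 160), 255, dtype=np.uint8)
squares = img.astype(np.int64) ** 2

print(squares.sum(dtype=np.int32))   # 32-bit accumulator wraps past INT32_MAX (negative here)
print(squares.sum(dtype=np.int64))   # 224*224*160 * 255**2 = 522031104000, the correct sum
```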
53 | PyTorch | 45aaaef22cdc9d87f2c04762fce9ffeeff290330 | Python | Unit test | overflow | exception | Python uses arbitrary precision integers, which can scale to be as large as needed, up to the amount of memory available to the computer. C++ primitives are fixed in their precision, and are commonly either 8, 16, 32, or 64 bits. | A timing function in code used for benchmarking can overflow when calling C++ code | precision tests/speed benchmarks | timing | benchmarking timing | add overflow check | add overflow check | Check if the operation would overflow a 32 bit signed primitive from Python before using this value in C++. Add a break statement with an overflow-threshold condition to prevent overflow | # Avoid overflow in C++ pybind11 interface + if number * 10 > 2147483647: + break |
|||||
54 | PyTorch | c675727adf36bdbb60933c9c7529d3ee34462093 | C++ | Fix | overflow | torch.empty(size) returns a tensor filled with uninitialized data. The shape of the tensor is defined by the variable argument size | Incorrect error message that fails to indicate an overflow. Overflow occurs when the input into torch.empty is very large | tensor math | tensor math | torch.empty (i.e.: tensor with uninitialized data) | fix test/warning | correct error message | change the error message to indicate overflow | TypeError: empty(): argument 'size' must be tuple of ints, but found element of type int at pos 1 | RuntimeError: Overflow when unpacking long | |||||
55 | PyTorch | a69910868a5962e2d699c6069154836e262a29e2 | Python | Fix | overflow | DistributedSampler restricts data loading to a subset of the dataset. DistributedSampler takes a dataset as input and loads a sample of it. torch.utils.data.distributed.DistributedSampler(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, drop_last=False) num_replicas (int, optional): Number of processes participating in distributed training. By default, :attr:`world_size` is retrieved from the current distributed group. |
DistributedSampler takes a dataset as input and loads a sample of it. When `len(dataset) * 2 < num_replicas`, there is a possibility of overflow |
statistical distributions | data sampling | distributions, sampling, data loading | rewrite math formula | rewrite math formula | rewrite formula for indexing data points in dataset and add if else logic | indices += indices[:(self.total_size - len(indices))] | padding_size = self.total_size - len(indices) + if padding_size <= len(indices): + indices += indices[:padding_size] + else: + indices += (indices * math.ceil(padding_size / len(indices)))[:padding_size] |
|||||
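For row 55, the patched padding logic in plain Python (a standalone sketch, not the DistributedSampler class itself): when the padding needed exceeds the dataset length, the index list is tiled enough times before truncation.

```python
import math

def pad_indices(indices, total_size):
    # Mirrors the logic of the patch above.
    padding_size = total_size - len(indices)
    if padding_size <= len(indices):
        return indices + indices[:padding_size]
    return indices + (indices * math.ceil(padding_size / len(indices)))[:padding_size]

# A 2-sample dataset spread across 8 replicas (total_size = 8).
print(pad_indices([0, 1], 8))   # [0, 1, 0, 1, 0, 1, 0, 1]
```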
56 | PyTorch | 6debe825beb36fc8e894a1b0a14bd5b4ebcd6090 | GLSL, Python, C++ | New feature | loss of precision | Vulkan is an open-standard graphics and compute API. GLSL (OpenGL Shading Language) is a special shading language with syntax similar to C. A shader is essentially a function required to draw something on the screen. Shaders run on a GPU. The RelaxedPrecision decoration allows 32-bit integer and 32-bit floating-point operations to execute with a relaxed precision of somewhere between 16 and 32 bits. More info: https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.html |
Add a new feature to allow a relaxed precision mode via a cmake option | non-standard precision | non-standard precision | GLSL shaders, GPU | add new precision option | add new precision option | Introduces the cmake option USE_VULKAN_RELAXED_PRECISION, which controls the precision used in Vulkan shaders. The option relaxes precision so that operations execute somewhere in the 16- to 32-bit range on Vulkan. Note: the default setting is 32-bit precision. | N/A | option(USE_VULKAN_RELAXED_PRECISION "Use Vulkan relaxed precision(mediump)" OFF) +if(USE_VULKAN_RELAXED_PRECISION) + string(APPEND CMAKE_CXX_FLAGS " -DUSE_VULKAN_RELAXED_PRECISION") +endif() |
|||||
57 | PyTorch | 324c18fcad579b1afa63ae45528bf598ba8ec4ca | Cuda | Fix | underflow | Computes division a/b using the formula a * (1/b) | The division operation, where the denominator is a low-precision scalar, has a risk of underflow. The reciprocal used for the division was calculated in the same precision as the non-scalar operands. | tensor math | tensor math | Cuda, division | increase variable precision/change variable type | change variable type | Compute the scalar reciprocal in the higher-precision accumulation type rather than the operand type: replace scalar_t with accscalar_t. | auto inv_b = scalar_t(1.0) / iter.scalar_value<scalar_t>(2); | using accscalar_t = at::acc_type<scalar_t, true>; auto inv_b = accscalar_t(1.0) / iter.scalar_value<accscalar_t>(2); |
@onlyCUDA + @dtypes(torch.half) + def test_divmul_scalar(self, device, dtype): + x = torch.tensor(100., device=device, dtype=dtype) + x_ref = x.float() + scale = 1e5 + res = x.div(scale) + expected = x_ref.div(scale) + self.assertEqual(res, expected.to(dtype), atol=0., rtol=0.) + x = torch.tensor(1e-5, device=device, dtype=dtype) + x_ref = x.float() + res = x.mul(scale) + expected = x_ref.mul(scale) + self.assertEqual(res, expected.to(dtype), atol=0., rtol=0.) + res = scale * x + self.assertEqual(res, expected.to(dtype), atol=0., rtol=0.) |
division | |||
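For row 57, the regression can be re-enacted from Python by forcing the reciprocal to be formed in half precision, where the scalar 1e5 already overflows to inf; the values follow the regression test in this row, but the re-enactment below is illustrative, not the kernel code.

```python
import torch

x = torch.tensor(100.0, dtype=torch.float16)
scale = 1e5

# Old behaviour re-enacted: 1e5 overflows to inf in half, so 1/scale becomes 0.
inv_half = torch.tensor(1.0, dtype=torch.float16) / torch.tensor(scale, dtype=torch.float16)
print(x * inv_half)                              # 0.0 instead of ~1e-3

# Fixed behaviour: form the reciprocal in the higher-precision accumulate type.
inv_acc = 1.0 / scale
print((x.float() * inv_acc).to(torch.float16))   # ~0.001
print(x.div(scale))                              # matches the fixed path
```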
58 | PyTorch | 24a8614cac3af1711eccc7294fd47ac30aefa8cc | Python | Add warning | overflow | cuFFT = CUDA Fast Fourier Transform library | non-standard precision | non-standard precision | CUDA, half precision, warning | disable test/warning | add overflow warning | Add a warning message to warn programmer of possible overflow when operation performed in half precision. Message: "Due to limited dynamic range of half datatype, performing this operation in half precision may cause the first element of result to overflow for certain inputs" | ||||||||
59 | PyTorch | fe684679b06f7f2fe7a7e136ea5605c04254b652 | C++ | disable test | overflow | runtime error | The csrc directory contains all of the code concerned with integration with Python. This is in contrast to lib, which contains the Torch libraries that are Python agnostic. csrc depends on lib, but not vice versa. | Runtime error from overflow when unpacking large numbers. The bug is: torch.tensor([0.1, 999999999999999999999]) fails with "Overflow when unpacking double" | other | other | Convert Python float to C++ float, Python integration | disable test/warning | disable overflow and precision test | Delete code that throws an exception on overflow and lost precision | - if (PyLong_Check(obj)) { - int overflow; - long long value = PyLong_AsLongLongAndOverflow(obj, &overflow); - if (overflow != 0) { - throw std::runtime_error("Overflow when unpacking double"); - } - if (value > DOUBLE_INT_MAX || value < -DOUBLE_INT_MAX) { - throw std::runtime_error("Precision loss when unpacking double"); - } - return (double)value; - } |
delete old solution | def test_unpack_double(self, device, dtype): + # Reference: https://github.com/pytorch/pytorch/issues/33111 + vals = (2 ** 24 + 1, 2 ** 53 + 1, + np.iinfo(np.int64).max, np.iinfo(np.uint64).max, np.iinfo(np.uint64).max + 1, + -1e500, 1e500) + for val in vals: + t = torch.tensor(val, dtype=dtype, device=device) + a = np.array(val, dtype=torch_to_numpy_dtype_dict[dtype]) + self.assertEqual(t, torch.from_numpy(a)) |
|||
60 | PyTorch | 7417b4c66f5b0901f206bf48b64de07384770724 | Cuda | Fix | overflow | ConvTranspose3d applies a 3D transposed convolution operator over an input image composed of several input planes. The transposed convolution operator multiplies each input value element-wise by a learnable kernel, and sums over the outputs from all input feature planes. torch.nn.ConvTranspose3d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros') |
The index in torch.nn.ConvTranspose3d overflows | CNN operations | convolution | convolution transpose | add test/warning | change variable type, increase variable precision | Change the variable type of the index from int to int64_t and require that input.numel() <= UINT_MAX and channels * kernel.numel() <= UINT_MAX. Note: this is a second attempt to fix the problem | int data_col_index = - (((((c_im * kernel_t + t_k) * kernel_h + h_k) * kernel_w + - w_k) * - depth_col + - t_col) * - height_col + - h_col) * - width_col + - w_col; |
const int64_t idx_k = + ((c_im * kernel_t + t_k) * kernel_h + h_k) * kernel_w + w_k; + const int64_t data_col_index = + ((idx_k * depth_col + t_col) * + height_col + h_col) * + width_col + w_col; val += data_col[data_col_i |
const auto num_kernels = channels * depth * height * width; + + auto check_fits_in_unsigned = + [](int64_t val, const char * name) { + constexpr auto umax = std::numeric_limits<unsigned>::max(); + TORCH_CHECK(val >= 0 && val <= umax, + name, " must fit in a 32-bit unsigned value"); + }; + check_fits_in_unsigned(num_kernels, "input size"); + check_fits_in_unsigned( + channels * patch_t * patch_h * patch_w, "channels x kernel size"); |
||||
61 | PyTorch | 0a159b0a3a78a80fb0f9082087a98f87f2dea986 | C++ | Fix | loss of precision | inaccurate/incorrect result | torch.remainder gives the wrong output for very large float dividends due to loss of precision. For example, x = torch.tensor(2749682432.0) q = 36 print(torch.remainder(x,q)) actual output is 128.0 whereas the correct output should be 20 |
tensor math | tensor math | remainder | use a different algorithm | use a different algorithm | Use the SLEEF library to calculate mod for floats, e.g. Sleef_fmodf8, a vectorized single-precision FP remainder. | return a - b * at::native::floor_impl(a / b); | Vec256<BFloat16> fmod(const Vec256<BFloat16> & q) const { + __m256 x_lo, x_hi; + cvtbf16_fp32(values, x_lo, x_hi); + __m256 q_lo, q_hi; + cvtbf16_fp32(q.values, q_lo, q_hi); + auto o1 = Sleef_fmodf8(x_lo, q_lo); + auto o2 = Sleef_fmodf8(x_hi, q_hi); + return cvtfp32_bf16(o1, o2); scalar_t mod = std::fmod(a, b); if ((mod != 0) && ((b < 0) != (mod < 0))) mod += b; return mod; |
modulo | ||||
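For row 61, the loss of precision comes from forming b * floor(a / b): in float32 both the quotient and the product are rounded to about 24 bits, so the small true remainder disappears, while fmod computes it directly. A sketch with the example values from this row:

```python
import math
import numpy as np

a, b = 2749682432.0, 36.0   # the true remainder is 20

a32, b32 = np.float32(a), np.float32(b)
print(a32 - b32 * np.floor(a32 / b32))   # a large wrong value, not 20
print(np.fmod(a32, b32))                 # 20.0 even in float32
print(math.fmod(a, b))                   # 20.0 in double as well
```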
62 | PyTorch | 63b1ae69831cd21bc4d6059a5854bc1155a152c9 | Cuda | Fix | overflow | C++ std:: fmod definition: The floating-point remainder of the division operation x/y calculated by this function is exactly the value x - n*y, where n is x/y with its fractional part truncated. The returned value has the same sign as x and is less than y in magnitude. If successful, returns the floating-point remainder of the division x/y as defined above. If a domain error occurs, an implementation-defined value is returned (NaN where supported) If a range error occurs due to underflow, the correct result (after rounding) is returned. |
overflow in torch.remainder when dividend is very large | tensor math | tensor math | remainder | rewrite math formula | rewrite math formula | Use fmod from C++ standard library to calculate remainder instead of a - b * floor(a/b). And account for an edge case: if the result of fmod is not zero (i.e.; a is not divisible by b) and either (1) the divisor is less than zero while the remainder is greater than zero, or (2) the divisor is greater than zero while the remainder is less than zero. If that is the case, increment the result of fmod by the divisor. | return a - b * static_cast<scalar_t>(std::floor(a / b)); | auto mod = ::fmod(a, b); + if ((mod != 0) && ((b < 0) != (mod < 0))) mod += b; + return mod; |
def test_remainder_fmod_large_dividend(self, device, dtype): + alarge = 1e9 + pi = 3.14159265358979 + for avalue in [alarge, -alarge]: + for bvalue in [pi, -pi]: + a = torch.tensor([avalue], dtype=dtype, device=device) + b = torch.tensor([bvalue], dtype=dtype, device=device) + c = torch.remainder(a, b) + d = torch.fmod(a, b) + self.assertTrue((b[0] > 0) == (c[0] > 0)) # remainder has same sign as divisor + self.assertTrue((a[0] > 0) == (d[0] > 0)) # fmod has same sign as dividend + self.assertTrue(abs(c[0]) < abs(b[0])) # remainder is within range of divisor + self.assertTrue(abs(d[0]) < abs(b[0])) # fmod is within range of divisor + if ((a[0] > 0) == (b[0] > 0)): + self.assertTrue(c[0] == d[0]) # remainder is same as fmod + else: + self.assertTrue(abs(c[0] - d[0]) == abs(b[0])) # differ by one divisor |
remainder, division | |||
63 | PyTorch | b33e38ec475017868534eb114741ad32c9d3b248 | C++ | Fix | loss of precision | arange creates a 1D tensor using start, end, and step size | Step and input have the same type. The step variable may require higher precision than the start and end variables, for example when step is a very small number. | tensor math | tensor creation | vectorized calculations, low level tensor math, CPU | increase variable precision/change variable type | increase variable precision | Allow a higher-precision step type for Vec256::arange by making the type of step independent of the input type. Often a double is required for the step while the input remains single precision. | static Vec256<T> arange(T base = static_cast<T>(0), T step = static_cast<T>(1)) | template<typename step_t> // step sometimes requires a higher precision type (e.g., T=int, step_t=double) static Vec256<T> arange(T base = static_cast<T>(0), step_t step = static_cast<step_t>(1)) { |
N/A | ||||
64 | PyTorch | 5c423cae72b3b720a0857a8237a499d0e07d6b98 | Python | Unit test | loss of precision | Linspace creates a 1D tensor of size steps whose values are evenly spaced from start to end, inclusive. Logspace creates a 1D tensor of size steps whose values are evenly spaced from base^start to base^end inclusive, on a logarithmic scale with base "base". |
The precision of the CUDA half-precision computation of linspace and logspace is poor | tensor math | tensor creation | testing precision, Cuda, half precision, linspace, logspace | add test/warning | add precision test | Adds precision tests for CUDA half (16 bits), float (32 bits), and double (64 bits). Since linspace/logspace are deterministic, we can compute an expected amount of error (by testing without a precision override), add a tiny amount (EPS) to that, and use that value as the override. EPS = 1e-5 |
LINSPACE_LOGSPACE_EXTRA_EPS = 1e-5 + # Tests that compare a device's computation with the (gold-standard) CPU's. class TestDevicePrecision(TestCase): - def test_linspace(self, device): - a = torch.linspace(0, 10, 10, device=device) - b = torch.linspace(0, 10, 10) + + # The implementation of linspace+logspace goes through a different path + # when the steps arg is equal to 0 or 1. For other values of `steps` + # they call specialized linspace (or logspace) kernels. + LINSPACE_LOGSPACE_SPECIAL_STEPS = [0, 1] + + def _test_linspace(self, device, dtype, steps): + a = torch.linspace(0, 10, steps=steps, dtype=dtype, device=device) + b = torch.linspace(0, 10, steps=steps) self.assertEqual(a, b) - @dtypes(torch.double) - def test_logspace(self, device, dtype): - a = torch.logspace(1, 10, 10, dtype=dtype, device=device) - b = torch.logspace(1, 10, 10, dtype=dtype, device='cpu') + # See NOTE [Linspace+Logspace precision override] + @precisionOverride({torch.half: 0.0039 + LINSPACE_LOGSPACE_EXTRA_EPS}) + @dtypesIfCUDA(torch.half, torch.float, torch.double) + @dtypes(torch.float, torch.double) + def test_linspace(self, device, dtype): + self._test_linspace(device, dtype, steps=10) + + @dtypesIfCUDA(torch.half, torch.float, torch.double) + @dtypes(torch.float, torch.double) + def test_linspace_special_steps(self, device, dtype): + for steps in self.LINSPACE_LOGSPACE_SPECIAL_STEPS: + self._test_linspace(device, dtype, steps=steps) + + def _test_logspace(self, device, dtype, steps): + a = torch.logspace(1, 1.1, steps=steps, dtype=dtype, device=device) + b = torch.logspace(1, 1.1, steps=steps) self.assertEqual(a, b) - # Check non-default base=2 - a = torch.logspace(1, 10, 10, 2, dtype=dtype, device=device) - b = torch.logspace(1, 10, 10, 2, dtype=dtype, device='cpu') + def _test_logspace_base2(self, device, dtype, steps): + a = torch.logspace(1, 1.1, steps=steps, base=2, dtype=dtype, device=device) + b = torch.logspace(1, 1.1, steps=steps, base=2) self.assertEqual(a, b) + # See NOTE [Linspace+Logspace precision override] + @precisionOverride({torch.half: 0.0157 + LINSPACE_LOGSPACE_EXTRA_EPS}) + @dtypesIfCUDA(torch.half, torch.float, torch.double) + @dtypes(torch.float, torch.double) + def test_logspace(self, device, dtype): + self._test_logspace(device, dtype, steps=10) + + # See NOTE [Linspace+Logspace precision override] + @precisionOverride({torch.half: 0.00201 + LINSPACE_LOGSPACE_EXTRA_EPS}) + @dtypesIfCUDA(torch.half, torch.float, torch.double) + @dtypes(torch.float, torch.double) + def test_logspace_base2(self, device, dtype): + self._test_logspace_base2(device, dtype, steps=10) + + @dtypesIfCUDA(torch.half, torch.float, torch.double) + @dtypes(torch.float, torch.double) + def test_logspace_special_steps(self, device, dtype): + for steps in self.LINSPACE_LOGSPACE_SPECIAL_STEPS: + self._test_logspace(device, dtype, steps=steps) + self._test_logspace_base2(device, dtype, steps=steps) \ |
start, start + (end-start)/(steps-1), ..., start + (steps - 2) * (end-start)/(steps-1) https://pytorch.org/docs/stable/generated/torch.logspace.html |
|||||
65 | PyTorch | b9b9fd4fadc4d4fa0b030941a35011956eafa10b | C++ | Disable warning | overflow | Warning pragma enables selective modification of the behavior of compiler warning messages. The pragma warning( push ) stores the current warning state for every warning. The pragma warning( push, n ) stores the current state for every warning and sets the global warning level to n. The pragma warning( pop ) pops the last warning state pushed onto the stack. Any changes that you made to the warning state between push and pop are undone. | False arithmetic overflow warning in MSVC ( Microsoft Visual compiler for C, C++) results in code not compiling | compiler | compiler | overflow warning, Microsoft compiler | disable test/warning | disable warning | Disable warnings for arithmetic overflow raised by MSVC (Microsoft Visual C Compiler). Add logic to ignore warning using warning pragma: pragma warning(disable : 4146) that allows for ignoring specified warning messages. Also push and pop are used. | // Ignore the false warning "Arithmetic overflow" for MSVC + #ifdef _MSC_VER + # pragma warning(push) + # pragma warning(disable : 4146) + #endif + /// Gets the minimum value for a N-bit signed integer. inline int64_t minIntN(int64_t N) { assert(N > 0 && N <= 64 && "integer width out of range"); + return -(UINT64_C(1) << (N - 1)); } + #ifdef _MSC_VER + # pragma warning(pop) + #endif |
||||||
66 | PyTorch | ec8e75ea92ae2b5ea73b4aeb3ec7cb39e9f95db9 | Cuda | Fix | overflow | Histograms are an important data representation with many applications in computer vision, data analytics and medical imaging. Histogram is a popular analytic graphical representation of data distribution resulting from processing a given numerical input data. | Not enough bits to represent the necessary values using an int for nbins. getBin function in Cuda overflows for large bVal and nbins values => (bVal - minvalue) * nbins = inf |
other | other | Cuda histogram | increase variable precision/change variable type | increase variable precision | Patch: increase precision from int (32 bits) to int64 | t = torch.zeros([10], dtype=torch.int32, device='cuda') + # 35488 * 65536 as int32 would cause overflow to negative value + # giving negative bin offset + t[0] = 35488 + counted = t.bincount(minlength=65536) + self.assertEqual(torch.sum(counted), 10) |
||||||
67 | PyTorch | 17c1b2c7159a0218a69e8486eb4212339253353a | Python | Fix | overflow | Saturation arithmetic is a version of arithmetic in which all operations such as addition and multiplication are limited to a fixed range between a minimum and maximum value. If the result of an operation is greater than the maximum, it is set ("clamped") to the maximum; if it is below the minimum, it is clamped to the minimum. The name comes from how the value becomes "saturated" once it reaches the extreme values; further additions to a maximum or subtractions from a minimum will not change the result. |
In quantization code, the range (i.e., the min and max values of the fixed range), which is used as a fallback to the default 8-bit qmin and qmax calculation when dynamic range is not used, can cause overflow | quantization | quantization | quantization, range | rewrite math formula | rewrite math formula | Change the range: relax scale and zero-point for activations to ensure that fbgemm implementations of conv and linear do not saturate due to 16-bit intermediate accumulation. Note that current PyTorch advises: "Please use quant_min and quant_max to specify the range for observers. reduce_range will be deprecated in a future release of PyTorch." | if self.dtype == torch.qint8: - qmin, qmax = -128, 127 else: - qmin, qmax = 0, 255 |
@@ -59,9 +61,15 @@ class ObserverBase(ABC, nn.Module): ) if self.dtype == torch.qint8: + if self.reduce_range: + qmin, qmax = -64, 63 + else: + qmin, qmax = -128, 127 else: + if self.reduce_range: + qmin, qmax = 0, 127 + else: + qmin, qmax = 0, 255 |
class ObserverTest(QuantizationTestCase): @given(qdtype=st.sampled_from((torch.qint8, torch.quint8)), - qscheme=st.sampled_from((torch.per_tensor_affine, torch.per_tensor_symmetric))) - def test_minmax_observer(self, qdtype, qscheme): - myobs = MinMaxObserver(dtype=qdtype, qscheme=qscheme) + qscheme=st.sampled_from((torch.per_tensor_affine, torch.per_tensor_symmetric)), + reduce_range=st.booleans()) + def test_minmax_observer(self, qdtype, qscheme, reduce_range): + # reduce_range cannot be true for symmetric quantization with uint8 + if qdtype == torch.quint8 and qscheme == torch.per_tensor_symmetric: + reduce_range = False + myobs = MinMaxObserver(dtype=qdtype, qscheme=qscheme, reduce_range=reduce_range) |
||||
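For row 67, reduce_range shrinks the quantized range to 7 bits so 16-bit intermediate accumulations in fbgemm cannot saturate. The helper below is an illustrative affine (asymmetric) quantization-parameter calculation, not the PyTorch observer code; clamping of the zero point is omitted for brevity.

```python
def affine_qparams(x_min, x_max, qmin, qmax):
    # Standard affine quantization parameters for an observed float range.
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

x_min, x_max = -1.0, 1.0                      # illustrative observed activation range
print(affine_qparams(x_min, x_max, 0, 255))   # full quint8 range
print(affine_qparams(x_min, x_max, 0, 127))   # reduce_range: 7-bit range, coarser scale
```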
68 | PyTorch | c845984271a551ac1c61b9eb06a17fb57aafbd7e | Cuda | Fix | overflow | A loop from i to n uses int to store the index i, which overflows after it is incremented. Overflow makes the index negative, which will also cause buffer overflow | other | other | looping, Cuda | increase variable precision/change variable type | increase variable precision | increase precision from int to int 64 | #define CUDA_KERNEL_LOOP(i, n) \ for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); i += blockDim.x * gridDim.x) |
// CUDA: grid stride looping +// int64_t _i_n_d_e_x specifically prevents overflow in the loop increment. +// If input.numel() < INT_MAX, _i_n_d_e_x < INT_MAX, except after the final +// iteration of the loop where _i_n_d_e_x += blockDim.x * gridDim.x can be +// greater than INT_MAX. But in that case _i_n_d_e_x >= n, so there are no +// further iterations and the overflowed value in i=_i_n_d_e_x is not used. #define CUDA_KERNEL_LOOP(i, n) \ + int64_t _i_n_d_e_x = blockIdx.x * blockDim.x + threadIdx.x; \ + for (int i=_i_n_d_e_x; _i_n_d_e_x < (n); _i_n_d_e_x+=blockDim.x * gridDim.x, i=_i_n_d_e_x) |
@unittest.skipIf(not TEST_MEDIUM_TENSOR, "not enough memory") + def test_cuda_kernel_loop_overflow(self): + # Issue #24309: In extreme cases, the loop variable could overflow and continue + # the kernel loop with a negative index, causing a RuntimeError (invalid write): + x = torch.randn(1, 1, 1, 2**30 + 1, dtype=torch.float16, device="cuda") + expected = x[0, 0, 0, 2**30] + y = torch.nn.functional.avg_pool2d(x, kernel_size=1) + torch.cuda.synchronize() + self.assertEqual(y[0, 0, 0, 2**30], expected) + + @unittest.skipIf(not TEST_LARGE_TENSOR, "not enough memory") + def test_cuda_kernel_loop_overflow_large(self): + # Make sure input.numel() > INT_MAX is handled: + x = torch.randn(1, 1, 1, 2**31, dtype=torch.float16, device="cuda") + with self.assertRaisesRegex(RuntimeError, "integer out of range"): + y = torch.nn.functional.avg_pool2d(x, kernel_size=1) + + # Issue #24309: In extreme cases, the loop variable could overflow and continue + # the kernel loop with a negative index, causing a RuntimeError (invalid write): + x = torch.randn(1, 1, 1, 2**31 - 1, dtype=torch.float16, device="cuda") + expected = x[0, 0, 0, 2**31 - 2] + y = torch.nn.functional.avg_pool2d(x, kernel_size=1) + torch.cuda.synchronize() + self.assertEqual(y[0, 0, 0, 2**31 - 2], expected) |
|||||
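For row 68, the failure mode is simply 32-bit index wrap-around; a short numpy illustration of what happens to an int32 index when the grid stride is added near INT_MAX (values are illustrative):

```python
import numpy as np

i = np.array([2**31 - 128], dtype=np.int32)   # loop index late in a very large kernel loop
stride = 256                                  # stands in for blockDim.x * gridDim.x
print(i + stride)                             # wraps to a large negative index
```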
69 | PyTorch | 4d2bf0b51b71f96929b58c6e23fb71d3e25440ff | Python | Unit test | loss of precision | backward pass output in quantization aware training was not accurate enough | quantization | quantization | quantization aware training, testing precision, backward pass | increase variable precision/change variable type | increase variable precision | Increase precision from float to double | ||||||||
70 | PyTorch | af908d57ea07c593bb7c8db00c3139fc973b2d4c | Python | Unit test | loss of precision | Test for quantized operations' precision is failing in function def test_adaptive_avg_pool2d(self, X, output_size_h, output_size_w) due to double rounding | quantization | quantization | quantization, precision testing, average pooling | relax accuracy test tolerance | relax accuracy test tolerance | Increase the unittest precision tolerance to 1.0 to avoid failing | - self.assertEqual(X_ref, qX_repr, - message=error_message.format(name, X_ref, qX_repr)) |
self.assertEqual(X_ref, qX_hat.int_repr(), prec=1.0, message=error_message.format(name, X_ref, qX_hat)) |
||||||
71 | PyTorch | 83bfd76b2f7a9b388537eb00022622d9c6989890 | Python | Unit test | loss of precision | absolute tolerance (atol). An absolute tolerance is a fixed number that is used to make direct comparisons | A test in function make_input(batch_size) in an ONNX Runtime class (ONNX = Open Neural Network Exchange) fails with AssertionError: Not equal to tolerance rtol=0.001, atol=1e-07 |
other | other | GRU (Gated Recurrent Unit in RNN) | relax accuracy test tolerance | relax accuracy test tolerance | relax precision tolerance, absolute tolerance (atol) = 1e-5 | self.run_test(model, input, batch_size=RNN_BATCH_SIZE,) | self.run_test(model, input, batch_size=RNN_BATCH_SIZE, atol=1e-5) | |||||
72 | PyTorch | 77651615c8976b6ad7ddd8abf2a62cd54b573f56 | C++ | Fix | loss of precision | CHAR_BIT indicates how many bits are in a char. On almost every architecture today it's 8 bits to a char, but on some historical machines it has been 7. | The previous code used std::numeric_limits::digits, which for signed types excludes the sign bit, instead of the full bit width that this gemm implementation expected | quantization | quantization | quantization, fbgemm | increase variable precision/change variable type | increase variable precision | Use the correct number of bits of precision. Interestingly, while C expresses sizes in bytes, this library expects the precision in bits, so CHAR_BIT must be multiplied by the result of sizeof (which returns a number of bytes) in order to get this value in bits. | qparams.precision = std::numeric_limits<typename T::underlying>::digits; | qparams.precision = CHAR_BIT * sizeof(typename T::underlying); | |||||
73 | PyTorch | 9b69f21a95fa626522ef371f8557e7286f9db318 | C++ | Fix | loss of precision | The Code Generator (codegen.h/cpp) produces the string to be compiled on the device. The csrc directory in the PyTorch repo contains all of the code concerned with integration with Python. This is in contrast to lib, which contains the Torch libraries that are Python agnostic. csrc depends on lib, but not vice versa. The jit directory contains (most of) the C++ code for the PyTorch JIT, a language and compiler stack for executing PyTorch models portably and efficiently. The fuser accepts subgraphs wrapped in "fusion nodes" and tries to execute them by just-in-time (JIT) compiling kernels that run all the graph operations. Just-in-time (JIT) compilation (also dynamic translation or run-time compilation) is a way of executing computer code that involves compilation during execution of a program, at run time, rather than before execution. std::scientific modifies the default formatting for floating-point input/output: it sets the floatfield format flag of the stream to scientific, so floating-point values are written in scientific notation, with one digit before the decimal point, as many decimal digits as the precision field, and an exponential part consisting of the letter e followed by an optional sign and exponent digits. std::setprecision, when used in an expression out << setprecision(n) or in >> setprecision(n), sets the precision parameter of the stream out or in to exactly n. |
Literals are emitted at low precision for prim::Constant | compiler | compiler | code generation for compiler, fuser, JIT | increase variable precision/change variable type | increase variable precision | Patch 1: Emit a higher-precision literal for float values v in the fusion kernel using std::setprecision instead of std::scientific. Patch 2: increase precision in code that sets variable types: int to int64 and float to double |
// Note: The NAN, NEG_INFINITY and POS_INFINITY strings map to device-specific // implementations of these special values. These macros are found in the // resource strings for each device. static std::string scalarValue(const double v) { std::ostringstream out; if (std::isnan(v)) { out << "NAN"; } else if (std::isinf(v)) { if (v < 0) { out << "NEG_INFINITY"; } else { out << "POS_INFINITY"; } } else { out << std::setprecision(16) << v; } return out.str(); } |
@unittest.skipIf(RUN_CUDA, 'This tests the CPU fuser') + @unittest.skipIf(IS_WINDOWS or IS_SANDCASTLE, "NYI: fuser support for Windows or Sandcastle") + @enable_cpu_fuser + def test_fuser_double_literal_precision(self): + code = ''' + graph(%2 : Float(*, *)): + %4 : int = prim::Constant[value=1]() + %3 : float = prim::Constant[value=1.282549830161864]() + %5 : Float(*, *) = aten::add(%2, %3, %4) + %1 : Float(*, *) = aten::relu(%5) + return (%1) + ''' + + graph = parse_ir(code) + code = torch._C._jit_fuser_get_fused_kernel_code(graph, [torch.rand(3, 4)]) + FileCheck().check('1.282549830161864').run(code) |
|||||
74 | PyTorch | 8e1e29124de99c01d08a2e2c02455c72335a971d | Python | Fix | loss of precision | In various distributions (Bernoulli, Binomial, etc.) the expand method chooses to use probabilities over logits, which results in loss of precision | statistical distributions | statistical distributions | distributions | rewrite math formula | rewrite math formula | In the method “expand(self, batch_shape, _instance=None)” of the distribution class, change the logic of preference between probabilities and logits: if logits are available, use them over probabilities (not the other way around) |
||||||||
75 | PyTorch | 2ed95c58713b45a6a9dac4336135523555bc58a9 | C++ | Disable warning | overflow | error from the Microsoft compiler when building | compiler | compiler | Microsoft C++ compiler, Converter | disable test/warning | disable warning | disable warning using pragma warning disable | #ifdef _MSC_VER +#pragma warning( disable : 4146 ) +#endif |
|||||||
76 | PyTorch | dc72a5e02c1ecb105ea58cafcf10ef3a6f7d9c25 | C++ | Fix | underflow | CV refers to OpenCV; rotatedRectangleIntersection is a function in the OpenCV library that finds out whether there is any intersection between two rotated rectangles. int cv::rotatedRectangleIntersection ( const RotatedRect & rect1, const RotatedRect & rect2, OutputArray intersectingRegion ) |
cv::rotatedRectangleIntersection has a known float underflow bug that would cause a failure in ```CV_Assert(intersection.size() <= 8)```; the problem was reported in OpenCV |
data processing | image processing | OpenCV, rotated rectangle intersection | use a different algorithm | use a different algorithm | Replace rotatedRectangleIntersection with a custom replacement function cvfix_rotatedRectangleIntersection. When the OpenCV version is upgraded to be >= 4.0, we can remove this replacement function. |
|||||||
77 | PyTorch | 4b97a4642100e26d14c34c07c31643422d60ac48 | C++ | Disable warning | overflow | compilation error due to signed overflow | compiler | compiler | compiling | disable test/warning | disable warning | Disable strict-overflow flag to avoid compilation error | ADD_COMPILE_OPTIONS(-Wno-strict-overflow) ADD_COMPILE_OPTIONS(-Wno-error=strict-overflow) |
|||||||
78 | PyTorch | 55b25365e9e11ee4d9dfb02ff1c79081225c7bd1 | C++ | New feature | loss of precision | N/A | N/A | non-standard precision | non-standard precision | quantization, low precision computations | other | add new algorithm | Add feature to allow 8 bit precision values (ultra low precision) | is_same<T, uint8_t>::value && GetCpuId().avx2(); | is_same<T, uint8_t>::value && GetCpuId().avx2() && !FLAGS_caffe2_dnnlowp_force_slow_path; |
|||||
79 | PyTorch | efd2aeac9e03a8813ba37db98e1a7645fa2902be | txt | Disable warning | overflow | Wno-stringop-overflow uses Object Size Checking to determine the sizes of destination objects | The stringop-overflow flag was only added in GCC 7 | compiler | compiler | GCC compiler flags | disable test/warning | disable warning | Change the logic for the compiler flag Wno-stringop-overflow: set it only if the GCC compiler version is >= 7 | if (CMAKE_COMPILER_IS_GNUCXX AND NOT (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 7.0.0)) + set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-stringop-overflow") + endif() |
||||||
80 | PyTorch | d97c9dd01904ff423554345cd877ebc1e520c21e | Python | Add warning | loss of precision | Check gradients computed via small finite differences against analytical gradients w.r.t. tensors in :attr:`inputs` that are of floating point or complex type and with ``requires_grad=True``. The check between numerical and analytical gradients uses :func:`~torch.allclose`. For most of the complex functions we consider for optimization purposes, no notion of Jacobian exists. Instead, gradcheck verifies if the numerical and analytical values of the Wirtinger and Conjugate Wirtinger derivatives are consistent. Because the gradient computation is done under the assumption that the overall function has a real-valued output, we treat functions with complex output in a special way. For these functions, gradcheck is applied to two real-valued functions corresponding to taking the real components of the complex outputs for the first, and taking the imaginary components of the complex outputs for the second. For more details, check out :ref:`complex_autograd-doc`. |
failure of the gradient check between numerical and analytical gradients due to low-precision inputs (from which the numerical gradients are computed) | gradients/derivatives | automatic differentiation | gradients, autograd, testing precision | add test/warning | add precision warning | Add a warning that inputs to the check between numerical and analytical gradients need to be of double precision | |||||||
81 | PyTorch | 4d287f90743e09d1fdc6e2b3519b16c2d1ae3fa3 | C++ | Fix | overflow | The for-loop index overflows if the input vector is large | tensor math | tensor math | loop index, low level math, summation of scalars | increase variable precision/change variable type | increase variable precision | increase precision from int to int64 | for (int i = k * WIDTH; i != size; i++) | Patch: increase precision from int to int64 @@ -102,7 +102,7 @@ struct Reduction { sum = std::accumulate(buf, buf + WIDTH, scalar_t(ident), ReduceScalar()); } + for (int64_t i = k * WIDTH; i != size; i++) { sum = ReduceScalar()(sum, data[i]); } return sum; |
||||||
82 | PyTorch | 7cbe63da8621b6063c864527592db6b1c894804f | Cuda | Fix | loss of precision | statistical distributions | statistical distributions | Distributions (Multinomial), THCTensorRandom, binarySearchForMultinomial |
rewrite math formula | rewrite math formula | // first non-zero element by setting start to size-1 here, + // the code below will move it to the last non-zero probability + // this actually can happen when the random number is 1 |
start = 0; | start = size - 1; | # Test a corner case from older PyTorch (Issue #4858) + freqs = torch.cuda.FloatTensor([ + 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, + 0.03178183361887932, 0.027680952101945877, 0.033176131546497345, + 0.046052902936935425, 0.07742464542388916, 0.11543981730937958, + 0.14148041605949402, 0.15784293413162231, 0.13180233538150787, + 0.08271478116512299, 0.049702685326337814, 0.027557924389839172, + 0.018125897273421288, 0.011851548217236996, 0.010252203792333603, + 0.007422595750540495, 0.005372154992073774, 0.0045109698548913, + 0.0036087757907807827, 0.0035267581697553396, 0.0018864056328311563, + 0.0024605290964245796, 0.0022964938543736935, 0.0018453967059031129, + 0.0010662291897460818, 0.0009842115687206388, 0.00045109697384759784, + 0.0007791675161570311, 0.00020504408166743815, 0.00020504408166743815, + 0.00020504408166743815, 0.00012302644609007984, 0.0, + 0.00012302644609007984, 4.100881778867915e-05, 0.0, 0.0, 0.0, 0.0, + 0.0, 0.0]) + + torch.cuda.manual_seed(11042) + sample = torch.multinomial(freqs, 1000, True) + self.assertNotEqual(freqs[sample].min(), 0) |
||||||
83 | PyTorch | 0167f76d2a99ced5f4359d8ea77eb6704179b797 | Python | Unit test | loss of precision | An absolute tolerance is a fixed number that is used to make direct comparisons. Rtol = relative tolerance |
precision tests/speed benchmarks | accuracy testing | testing precision, external to pytorch, onnx | relax accuracy test tolerance | relax accuracy test tolerance | Relax precision, specifically, add absolute tolerance (atol). | np.testing.assert_allclose( ref_outputs[i], outputs[i], rtol=1e-3) |
np.testing.assert_allclose( ref_outputs[i], outputs[i], rtol=1e-3, atol=1e-7) |
||||||
84 | PyTorch | 4b8f4fc25902e3a325b06e2db415bba9fad7c0ef | Python | New feature | loss of precision | N/A | N/A | non-standard precision | non-standard precision | mixed precision, training | add new precision option | add new precision option | allow mixed precision in distributed training | |||||||
85 | PyTorch | 873f1163806c14ae236538f76c44d04b63bef331 | Python | Unit test | loss of precision | The STFT computes the Fourier transform of short overlapping windows of the input. This gives the frequency components of the signal as they change over time. The interface of this function is modeled after the librosa stft function. | STFT (short-time Fourier transform) precision test not passing | precision tests/speed benchmarks | accuracy testing | testing precision, fourier transform | relax accuracy test tolerance | relax accuracy test tolerance | increase the precision tolerance in assertEqual from 5e-6 to 7e-6 | self.assertEqual(result.data, ref_result, 5e-6, 'stft result') | self.assertEqual(result.data, ref_result, 7e-6, 'stft result') | |||||
86 | PyTorch | f9fd82d8933639a8cf20a029c7fa47fff8fdb93d | Cuda | Fix | loss of precision | __half2float = defined in cuda: cuda_fp16.h. Converts half number to float. ScalarConvert = defined cvd library |
non-standard precision | non-standard precision | mixed precision, sigmoid | increase variable precision/change variable type | increase variable precision | Change how half-precision variables are converted to float in struct TensorSigmoidOp. Use ScalarConvert instead of __half2float. Change float to accreal, the higher-precision accumulation type | - float fin = __half2float(*in); - *out = __float2half(1.0f / (1.0f + expf(- fin))); -#define H2F(input) __half2float(input) -#define F2H(input) __float2half(input) |
float fin = ScalarConvert<half, float>::to(*in); + *out = ScalarConvert<float, half>::to(1.0f / (1.0f + expf(- fin))); +#define H2F(input) ScalarConvert<real, accreal>::to(input) +#define F2H(input) ScalarConvert<accreal, real>::to(input) |
||||||
87 | PyTorch | 35abc4efa2d08ef2e9b7d978089fbd98b8d14187 | C++ | New feature | loss of precision | N/A | N/A | torch.digamma(input, *, out=None) → Tensor Computes the logarithmic derivative of the gamma function on input. torch.polygamma(n, input, *, out=None) → Tensor Computes the nth derivative of the digamma function on input. n≥0 is called the order of the polygamma function. |
gradients/derivatives | derivatives | add new precision option | add new precision option | Add low-precision digamma() and polygamma() functions | def test_digamma(self): + def test(use_double=False): + cpu_tensor = torch.randn(10, 10, 10) + gpu_tensor = cpu_tensor.cuda() + zeros = torch.zeros(10, 10, 10) + if (use_double): + cpu_tensor = cpu_tensor.double() + gpu_tensor = gpu_tensor.double() + zeros = zeros.double() + cpu_out = cpu_tensor.digamma() + gpu_out = gpu_tensor.digamma() + norm_errors = (gpu_out - cpu_out.cuda()) / gpu_out + self.assertEqual(norm_errors, zeros) + + test(True) + test(False) + + def test_polygamma(self): + def test(use_double=False): + cpu_tensor = torch.randn(10, 10, 10) + gpu_tensor = cpu_tensor.cuda() + zeros = torch.zeros(10, 10, 10) + if (use_double): + cpu_tensor = cpu_tensor.double() + gpu_tensor = gpu_tensor.double() + zeros = zeros.double() + for n in [0, 1]: + cpu_out = cpu_tensor.polygamma(n) + gpu_out = gpu_tensor.polygamma(n) + norm_errors = (gpu_out - cpu_out.cuda()) / gpu_out + self.assertEqual(norm_errors, zeros) |
log, derivative | |||||
88 | PyTorch | 0443c11f7e4d14dfe5f5b23f4112a4c443d95a9c | Python | Fix | loss of precision | Volta is the codename for a GPU microarchitecture developed by Nvidia The major revision number is 7 for devices based on the Volta architecture, 6 for devices based on the Pascal architecture, 5 for devices based on the Maxwell architecture, 3 for devices based on the Kepler architecture, 2 for devices based on the Fermi architecture, and 1 for devices based on the Tesla architecture. torch.cuda.get_device_capability(device=None) Gets the cuda capability of a device. Return type: tuple(int, int) Returns: the major and minor cuda capability of the device |
On pre-Volta architectures (i.e., CUDA compute capability major version below 7), half precision needs special handling | non-standard precision | non-standard precision | GPU, cuDNN, RNN, half precision, hardware | other | check hardware | Fix half precision for older (pre-Volta) NVIDIA cards. Add additional logic for checking the major cuda capability of the currently selected device - check that the major cuda capability is at least 7. | - if version() >= 7000 and int(cuda[0]) >= 9: - lib.cudnnSetRNNMatrixMathType(self, CUDNN_DEFAULT_MATH) - if datatype == CUDNN_DATA_HALF: - lib.cudnnSetRNNMatrixMathType(self, CUDNN_TENSOR_OP_MATH) |
if version() >= 7000 and int(cuda[0]) >= 9 and ( + torch.cuda.get_device_capability(torch.cuda.current_device())[0] >= 7): + lib.cudnnSetRNNMatrixMathType(self, CUDNN_DEFAULT_MATH) + if datatype == CUDNN_DATA_HALF: + lib.cudnnSetRNNMatrixMathType(self, CUDNN_TENSOR_OP_MATH) |
|||||
89 | PyTorch | f7a459b28b184dedf265ed8718f85cb483e8284e | Cuda | Fix | overflow | MAGMA is a collection of next generation linear algebra (LA) GPU accelerated libraries designed and implemented by the team that developed LAPACK and ScaLAPACK. | overflow when using magma Ints are commonly only 32 bits. An int should not be used for a size. If you index an array of floats with an int, you can only store up to 8 GB of data in that array, much smaller than many workloads require. |
other | external library | external library, linear algebra | increase variable precision/change variable type | increase variable precision | Use a 64-bit value for the size. | - int n = a_->size[0]; - int nrhs = b_->size[1]; |
int64_t n = a_->size[0]; + int64_t nrhs = b_->size[1]; |
|||||
90 | PyTorch | 30bbeb8b87ef815d8a7ce8ab8af9a301fcedfbe9 | C++ | Change exception to a warning | overflow/loss of precision | exception due to check requiring that type conversions are exact | precision tests/speed benchmarks | accuracy testing | scalars, overflow check, precision check | relax accuracy test tolerance | relax accuracy test tolerance | relax type conversion check - only throw an exception if overflow | ||||||||
91 | PyTorch | a92fce18715be7317b5eab1319140899b355eb9f | Python | Unit test | loss of precision | cpu and gpu gradients are not exact | precision tests/speed benchmarks | accuracy testing | precision testing, testing that cpu and cuda gradients are equal | relax accuracy test tolerance | relax accuracy test tolerance | Relax precision tolerance from 0 to 5e-5 for comparing cpu and gpu gradients | self.assertEqual(grid_cpu.grad, grid_cuda.grad) | self.assertEqual(grid_cpu.grad, grid_cuda.grad, prec=5e-5) | ||||||
92 | PyTorch | 4c35c630eca9a7a3fbfc8f4bc72ea2fd5ba0dd05 | Python | Unit test | loss of precision | precision tests/speed benchmarks | accuracy testing | gradient check | relax accuracy test tolerance | relax accuracy test tolerance | Enable norm gradgradchecks by lowering precision requirements. Add absolute and relative tolerance precision based on empirical observations | gradgradcheck_precision_override = { + 'test_NormFunction_1_5': {'atol': 1e-2, 'rtol': 1e-2}, + 'test_NormFunction_2': {'atol': 1e-2, 'rtol': 1e-2}, + 'test_NormFunction_3': {'atol': 5e-2, 'rtol': 1e-2}, +} + if test_name in gradgradcheck_precision_override: + atol = gradgradcheck_precision_override[test_name]['atol'] + rtol = gradgradcheck_precision_override[test_name]['rtol'] + self.assertTrue(gradgradcheck(apply_fn, input, grad_y, atol=atol, rtol=rtol)) |
||||||||
93 | PyTorch | eaacfc7e25c0500f933b0c68e63f1d947739df90 | Python | Speed optimization | loss of precision | Performance issue in momentum update of SGD due to repeatedly converting gradient from 16 bits to 32 and vice versa. cuda was allocating and freeing chunks of memory frequently because grad was changing sizes | optimizers | optimizers | SGD optimizer, momentum update | increase variable precision/change variable type | increase variable precision | Set grad to a predetermined size of fp_32 so cuda no longer needs to alloc/free this frequently. | net.MomentumSGDUpdate( [grad_fp32, momentum_data, lr, param_fp32], - [grad, momentum_data, param_fp32], |
net.MomentumSGDUpdate( [grad_fp32, momentum_data, lr, param_fp32], + [grad_fp32, momentum_data, param_fp32], |
||||||
94 | PyTorch | 1f4317be3f02d84e93303193e782c1cb002b26e3 | Cuda, C++ | New feature | loss of precision | N/A | N/A | Collectives = collective communication in distributed computing. Gloo is a backend for distributed CPU training |
non-standard precision | non-standard precision | Distributed training, half precision | add new precision option | add new precision option | Add support for collectives over vectors of half-precision floating point values | ||||||
95 | PyTorch | aec182ae72d51dad0f46cdfe7ff9a41380d7da35 | Cuda | New feature | loss of precision | N/A | N/A | Performs a batch matrix-matrix product of matrices in batch1 and batch2. input is added to the final result. batch1 and batch2 must be 3-D tensors each containing the same number of matrices. |
linear algebra | linear algebra | tensor math, linear algebra | add new precision option | add new precision option | add support for half precision in tensormath blas in BADDBMM (batch matrix-matrix product) | ||||||
96 | PyTorch | 1bf7bc9768fa3f768419884595e08b3bc25913ea | Cuda | Fix | overflow | One often wants the type for the accumulator to be of higher precision than the inputs. When accumulating (summation for example), error will build up more. | sum accumulator had insufficient precision | statistical distributions | statistical distributions | distributions (multinomial) | increase variable precision/change variable type | increase variable precision | change data type for accumulator from T to AccT and add assertion to make sure the sum of distribution did not overflow (i.e.: is not inf) | assert(!isinf(sum)); | ||||||
97 | PyTorch | c1ba0fbab3ad3f1a4b2630de9629c4749469eada | C++ | New feature | loss of precision | N/A | N/A | non-standard precision | non-standard precision | cuDNN, ReLu, mixed precision | add new precision option | add new precision option | Decide at runtime which precision of types to use | |||||||
98 | PyTorch | 26516f667e688ed38c8ded71af8e1abc3a56d5ee | Python | Unit test | loss of precision | tensor math | statistics | testing precision, mean, standard deviation | relax accuracy test tolerance | relax accuracy test tolerance | relax precision tolerance in assertEqual | - self.assertEqual(r[:,:50].std(), 4, 0.2) - self.assertEqual(r[:,:50].std(), 4, 0.2) - self.assertEqual(q.mean(), 2, 0.1) - self.assertEqual(q.std(), 3, 0.1) - self.assertEqual(q.mean(), 0, 0.1) - self.assertEqual(q.std(), 1, 0.1) |
self.assertEqual(r[:,:50].std(), 4, 0.3) + self.assertEqual(r[:,:50].std(), 4, 0.3) + self.assertEqual(q.mean(), 2, 0.3) + self.assertEqual(q.std(), 3, 0.3) + self.assertEqual(q.mean(), 0, 0.2) + self.assertEqual(q.std(), 1, 0.2) |
|||||||
99 | PyTorch | cd780eb9ec20827a924c658b5960be452797076d | C++ | Speed optimization | inefficient algorithm | AXPBY scales two vectors, adds them to one another, and stores the result in one of the vectors. In this case the type is double (daxpby). ?axpby performs the vector-vector operation defined as y := a*x + b*y, where a and b are scalars and x and y are vectors of length n |
In caffe2 CPU math using MKL (an optimized Intel math library), the function CAFFE2_SPECIALIZED_AXPBY(double, d) suffers from underflow. When running caffe2 experiments, calling Exp with many values close to 0 causes MKL's underflow error handler to be called repeatedly, adding significant overhead even though the result is correct (e.g. exp(x) = 0). |
other | external library | external library (MKL), exp, caffe2 | disable test/warning | disable warning | Disable MKL's underflow checker to speed up operation by setting the error mode to VML_ERRMODE_IGNORE | -#define DELEGATE_SIMPLE_UNARY_FUNCTION(T, Funcname, OriginalFunc) \ -template <> \ -void Funcname<T, CPUContext>( \ - const int N, const T* x, T* y, \ - CPUContext* context) { \ - OriginalFunc(N, x, y); \ -} -DELEGATE_SIMPLE_UNARY_FUNCTION(float, Exp, vsExp) -DELEGATE_SIMPLE_UNARY_FUNCTION(double, Exp, vdExp) |
#define DELEGATE_SIMPLE_UNARY_FUNCTION(T, Funcname, OriginalFunc, ...) \ + template <> \ + void Funcname<T, CPUContext>( \ + const int N, const T* x, T* y, CPUContext* context) { \ + OriginalFunc(N, x, y, ##__VA_ARGS__); \ + } +DELEGATE_SIMPLE_UNARY_FUNCTION( + float, + Exp, + vmsExp, + VML_HA | VML_FTZDAZ_OFF | VML_ERRMODE_IGNORE) +DELEGATE_SIMPLE_UNARY_FUNCTION( + double, + Exp, + vmdExp, + VML_HA | VML_FTZDAZ_OFF | VML_ERRMODE_IGNORE) |
|||||
100 | PyTorch | 206029bc5a3f179abe97986641ed3ccd3c414126 | C++ | Fix | overflow | Integer literals are of type int, size index variable overflows if input tensor very big, specifically when input > 2GB). | other | external library | external library, caffe2 | increase variable precision/change variable type | increase variable precision | Increase precision of variable that holds tensor size from int 32 to int 64. Instead of passing in an integer literal, do a static cast on an integer literal to a larger datatype for the accumulator type | auto newSize = std::accumulate( - newDims.begin(), newDims.end(), 1, std::multiplies<TIndex>()); @@ -180,7 +183,10 @@ class Tensor { template <class T, class ContextForCopy> void Reserve(const std::vector<T>& newCapacity, ContextForCopy* context) { auto newSize = std::accumulate( - newCapacity.begin(), newCapacity.end(), 1, std::multiplies<TIndex>()); if (newSize * meta_.itemsize() <= capacity_) { return; } @@ -208,7 +214,10 @@ class Tensor { "New outer dimension must be smaller than current."); dims_[0] = outer_dim; size_ = std::accumulate( - dims_.begin(), dims_.end(), 1, std::multiplies<TIndex>()); } |
auto newSize = std::accumulate( + newDims.begin(), + newDims.end(), + static_cast<TIndex>(1), + std::multiplies<TIndex>()); if (newSize * meta_.itemsize() <= capacity_) { dims_ = newDims; size_ = newSize; @@ -180,7 +183,10 @@ class Tensor { template <class T, class ContextForCopy> void Reserve(const std::vector<T>& newCapacity, ContextForCopy* context) { auto newSize = std::accumulate( + newCapacity.begin(), + newCapacity.end(), + static_cast<TIndex>(1), + std::multiplies<TIndex>()); if (newSize * meta_.itemsize() <= capacity_) { return; } @@ -208,7 +214,10 @@ class Tensor { "New outer dimension must be smaller than current."); dims_[0] = outer_dim; size_ = std::accumulate( + dims_.begin(), + dims_.end(), + static_cast<TIndex>(1), + std::multiplies<TIndex>()); } |
||||||
101 | PyTorch | 5030d76acfcdd48492e988e3fc1aa19bebe9366a | Python | Fix | loss of precision | linear algebra | linear algebra | precision testing for CUDA blas | relax accuracy test tolerance | relax accuracy test tolerance | reduce precision of CUDA blas tests | custom_precision = { 'addbmm': 1e-4, 'addmm': 1e-4, + 'addmv': 1e-4, + 'addr': 1e-4, + 'baddbmm': 1e-4, 'rsqrt': 1e-4, 'cumprod': 1e-4, } |
||||||||
102 | PyTorch | a489884da4b63e33ede107261afd6a4a81d9401a | Python | Unit test | loss of precision | torch.addmm(input, mat1, mat2, *, beta=1, alpha=1, out=None) → Tensor Performs a matrix multiplication of the matrices mat1 and mat2. The matrix input is added to the final result. alpha and beta are scaling factors on the matrix-matrix product of mat1 and mat2 and on the added matrix input, respectively. |
linear algebra | linear algebra | precision testing for matrix multiply | relax accuracy test tolerance | relax accuracy test tolerance | Reduce precision of addmm CUDA test | custom_precision = { 'addbmm': 1e-4, + 'addmm': 1e-4, 'rsqrt': 1e-4, 'cumprod': 1e-4, } |
out = Beta * input + Alpha * (mat1_i @ mat2_i) | ||||||
103 | PyTorch | a0fb1ab86e88d5c98733d7e6e5aa3b5811fe24f4 | Python | Unit test | loss of precision | torch.rsqrt(input, *, out=None) → Tensor Returns a new tensor with the reciprocal of the square-root of each of the elements of input. | linear algebra | linear algebra | precision testing for matrix multiply and square root | relax accuracy test tolerance | relax accuracy test tolerance | Reduce precision for addmm and rsqrt CUDA tests | out_i = 1/(sqrt(input_i)) | |||||||
104 | PyTorch | f7fe6cf1a6a58c55335e1b337dbdd23a78a2f74a | C | Fix | overflow | statistical distributions | statistical distributions | multinomial distribution | increase variable precision/change variable type | increase variable precision | Using higher precision type for accumulator | void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso for (i=0; i<n_dist; i++) { /* Get normalized cumulative distribution from prob distribution */ - real sum = 0; for (j=0; j<n_categories; j++) { sum += THStorage_(get)( \ @@ -160,7 +160,7 @@ void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso /* update cumulative distribution so that sample cannot be drawn again */ real diff; real new_val = 0; - real sum; if (sample_idx != 0) { |
void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso for (i=0; i<n_dist; i++) { /* Get normalized cumulative distribution from prob distribution */ + accreal sum = 0; for (j=0; j<n_categories; j++) { sum += THStorage_(get)( \ @@ -160,7 +160,7 @@ void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso /* update cumulative distribution so that sample cannot be drawn again */ real diff; real new_val = 0; + accreal sum; if (sample_idx != 0) { |
|||||||
105 | Tensorflow/Keras | 2ccbbdb4b06bf0d60d02c7cf316fce117b77df55 | C++ | fix | overflow/underflow | softmax output is NaN | overflow/underflow | Direct calculation of the softmax function according to its definition formula is prone to numerical issues. The single-precision exp(x) function overflows for x > 89 and underflows for x < −104, which in turn causes NaN outputs in naïve implementations. |
activation functions | activation functions | softmax, openGL | use a different algorithm | use a different algorithm | Implement a tree pass softmax algorithm, see algorithm in https://arxiv.org/pdf/2001.04438.pdf | softmax | |||||
106 | Tensorflow/Keras | 115623e2fc21affeaeee5167daec9c1f0db27069 | C++ | fix | overflow/underflow | softmax output is NaN | overflow/underflow | Direct calculation of the softmax function according to its definition formula is prone to numerical issues. The single-precision exp(x) function overflows for x > 89 and underflows for x < −104, which in turn causes NaN outputs in naïve implementations. |
activation functions | activation functions | softmax, openCL | use a different algorithm | use a different algorithm | Implement a tree pass softmax algorithm, see algorithm in https://arxiv.org/pdf/2001.04438.pdf | softmax | |||||
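The stable formulation behind rows 105–106 subtracts the per-row maximum before exponentiating, so every argument to exp is at most 0 and neither overflow nor the resulting NaN can occur. A minimal NumPy sketch of that idea (illustrative only, not the TFLite GPU kernels):

```python
import numpy as np

def stable_softmax(logits):
    # Pass 1: per-row maximum; pass 2: shifted exponentials and their sum; pass 3: normalize.
    m = np.max(logits, axis=-1, keepdims=True)
    e = np.exp(logits - m)                       # arguments are <= 0, so exp cannot overflow
    return e / np.sum(e, axis=-1, keepdims=True)

# The textbook formula would evaluate exp(1000) and return NaN; the shifted form stays finite.
print(stable_softmax(np.array([1000.0, 1000.0, -1000.0])))   # [0.5 0.5 0. ]
```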
107 | Tensorflow/Keras | e665a737f90564cd143fdc1b15420720596d17e1 | C++ | fix | underflow | tensor math | statistics | mean test | rewrite math formula | rewrite math formula | auto input_rng = std::bind( - std::uniform_real_distribution<float>(-15.0f, 15.0f), std::ref(rng)); |
auto input_rng = + std::bind(std::uniform_real_distribution<float>(), std::ref(rng)); |
||||||||
108 | Tensorflow/Keras | e60c1ba960e598be9c0e0cdd331cdc10e8919dbb | C++ | fix | overflow/underflow | activation functions | activation functions | LSTM, logistic function | rewrite math formula | rewrite math formula | XlaOp Logistic(XlaOp x) { - auto half = xla::ScalarLike(x, 0.5); - return half + half * xla::Tanh(half * x); } |
XlaOp Logistic(XlaOp x) { + auto one = xla::ScalarLike(x, 1); + return xla::Div(one, (one + xla::Exp(xla::Neg(x)))); } |
def testFloatOpsDisabledOnMlirBridge(self): + for dtype in self.float_types: + if dtype != np.float16: + self._assertOpOutputMatchesExpected( + lambda x: math_ops.sigmoid(x) / math_ops.log1p(math_ops.exp(x)), + np.array([-40, 40], dtype=dtype), + expected=np.array([1.0, 0.025], dtype=dtype)) |
|||||||
109 | Tensorflow/Keras | 86fa42f516e4c5ca5ac3b2430aeab9d1a55afb30 | python | fix | loss of precision | the derivative of betainc outputs NaN | invalid operation | I = betainc(X,Z,W) computes the incomplete beta function for corresponding elements of the arrays X, Z and W. The elements of X must be in the closed interval [0, 1]. The arrays Z and W must be nonnegative and real. All arrays must be the same size, or any of them can be scalar. | When calculating the derivative of betainc, if a or b is equal to 1, there is a risk that log(0) occurs | gradients/derivatives | derivatives | derivative of Betainc (incomplete beta function) | rewrite math formula | rewrite math formula | Use xlog1py and xlogy instead of log. The function xlog1py computes x * log1p(y) for a given x and y; it safely returns zero when x = 0, no matter what the value of y is. The function xlogy(x,y) returns 0 if x == 0, and x * log(y) otherwise, elementwise. | partial_x = math_ops.exp((b - 1) * math_ops.log(1 - x) + - (a - 1) * math_ops.log(x) - log_beta) |
# We use xlog1py and xlogy since the derivatives should tend to + # zero one one of the tails when a is 1. or b is 1. + partial_x = math_ops.exp(math_ops.xlog1py(b - 1, -x) + + math_ops.xlogy(a - 1, x) - log_beta) |
exp, log | ||
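Row 109's fix relies on xlogy and xlog1py returning exactly 0 when their first argument is 0, which removes the 0 · log(0) terms that appear when a = 1 or b = 1. SciPy ships the same helpers, so the effect can be illustrated outside TensorFlow (an analogy, not the patched gradient code):

```python
import numpy as np
from scipy.special import xlog1py, xlogy

a, b, x = 1.0, 2.0, 0.0   # a == 1 turns the (a - 1) * log(x) term into 0 * log(0)

with np.errstate(divide="ignore", invalid="ignore"):
    naive = (b - 1.0) * np.log1p(-x) + (a - 1.0) * np.log(x)   # 0 * -inf -> nan
stable = xlog1py(b - 1.0, -x) + xlogy(a - 1.0, x)              # exactly 0.0

print(naive, stable)   # nan 0.0
```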
110 | Tensorflow/Keras | ee85e6d230278e763a2784ba86acc747abdb2242 | C++ | fix | loss of precision | decreased accuracy | tensor math | statistics | variance | use a different algorithm | use a different algorithm | Use more numerically stable two-pass algorithm to calculate variance in MeanStddevNormalization. | for (int batch = 0; batch < n_batch; ++batch) { float sum = 0.0f; - float sum_sq = 0.0f; for (int i = 0; i < v_size; ++i) { sum += input_vector[i]; - sum_sq += input_vector[i] * input_vector[i]; } const float mean = sum / v_size; - const float variance = sum_sq / v_size - mean * mean; |
for (int batch = 0; batch < n_batch; ++batch) { float sum = 0.0f; } const float mean = sum / v_size; - const float variance = sum_sq / v_size - mean * mean; + float sum_diff_sq = 0.0f; + for (int i = 0; i < v_size; ++i) { + const float diff = input_vector[i] - mean; + sum_diff_sq += diff * diff; + } + const float variance = sum_diff_sq / v_size; |
variance, sum of squares | |||||
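Row 110's patch is the standard two-pass variance. A NumPy sketch with an assumed offset of 1e4 shows why the one-pass E[x²] − E[x]² form cancels catastrophically in float32 while the two-pass form does not:

```python
import numpy as np

def variance_one_pass(x):
    # E[x^2] - E[x]^2: the two terms nearly cancel when |mean| >> std.
    return np.mean(x * x) - np.mean(x) ** 2

def variance_two_pass(x):
    # Pass 1: the mean. Pass 2: average of squared deviations from that mean.
    d = x - np.mean(x)
    return np.mean(d * d)

x = (1e4 + np.random.randn(1000)).astype(np.float32)   # true variance ~= 1
print(variance_one_pass(x))   # garbage, can even come out negative
print(variance_two_pass(x))   # close to 1
```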
111 | Tensorflow/Keras | fd2d8bc50e9b3143544819bf505326e4ed6db2a5 | C++ | fix | overflow/underflow | incorrect result | overflow | XlaOp=Array to concatenate across replicas. asinh(x) = log(x + sqrt(x^2 + 1)) | risk of overflow due to x^2 for large x | tensor math | tensor math | inverse hyperbolc sine | rewrite math formula | rewrite math formula | For positive x, we can approximate x + sqrt(x^2 + 1) as 2*x and return log(2) + log(x). For negative x we utilize asinh(-x) = -asinh(x) | XlaOp Asinh(XlaOp x) { return Log(x + Sqrt(x * x + ScalarLike(x, 1.0))); } | XlaOp Asinh(XlaOp x) { + XlaBuilder* b = x.builder(); + auto do_it = [&](XlaOp x) -> StatusOr<XlaOp> { + TF_ASSIGN_OR_RETURN(auto shape, b->GetShape(x)); + auto one = ScalarLike(x, 1); + if (primitive_util::IsComplexType(shape.element_type())) { + return Log(x + Sqrt(x * x + one)); + } + auto a = Abs(x); + auto naive_result = Log(a + Sqrt(a * a + one)); + auto overflow_result = Log(Abs(a)) + Log(ScalarLike(a, 2)); + auto sqrt_max_value = Sqrt(MaxFiniteValue(b, shape.element_type())); + return Sign(x) * + Select(Ge(a, sqrt_max_value), overflow_result, naive_result); + }; + // These upcasts are not strictly necessary on all platforms to get within our + // error tolerances, so we could relax this if it ever mattered. + return DoWithUpcastToF32(x, {BF16, F16}, [&](XlaOp x) { + return b->ReportErrorOrReturn(do_it(x)); + }); +} |
log, square root, power | ||
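Row 111's branching can be mimicked in plain NumPy (a float64 sketch of the idea, not the XLA builder code): once |x| exceeds the square root of the largest finite value, x² would overflow, so asinh is taken as log(2) + log(|x|) and the sign is restored via asinh(−x) = −asinh(x).

```python
import numpy as np

def stable_asinh(x):
    x = np.asarray(x, dtype=np.float64)
    a = np.abs(x)
    sqrt_max = np.sqrt(np.finfo(np.float64).max)
    with np.errstate(over="ignore"):               # the naive branch is discarded where it overflows
        naive = np.log(a + np.sqrt(a * a + 1.0))   # fine for moderate |x|
    large = np.log(a) + np.log(2.0)                # asinh(a) ~= log(2a) for huge a
    return np.sign(x) * np.where(a >= sqrt_max, large, naive)

x = np.array([1e300, -1e300, 3.0])
print(stable_asinh(x))     # matches np.arcsinh(x)
print(np.arcsinh(x))
```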
112 | Tensorflow/Keras | f84e8257aa88fa45cc7a15835ad386565cd60237 | C++ | fix | loss of precision | In Eigen, a reduction is a function taking a matrix or array, and returning a single scalar value. One of the most used reductions is .sum() , returning the sum of all the coefficients inside a given matrix or array. | CNN operations | pooling layer | eigen reduction, summation, EigenPooling | use a different algorithm | use a different algorithm | use a tree algorithm for summation | summation | |||||||
113 | Tensorflow/Keras | 18f860fd8e1fdffd80633cf5ac32f895423dfa8d | C++ | fix | underflow/loss of precision | other | random number generator | testing, random number generation | rewrite math formula | rewrite math formula | change input range for random number generator | std::uniform_real_distribution<FloatT> generator(-0.9f, 1.0f); | std::uniform_real_distribution<FloatT> generator(1.0f, 1.125f); | |||||||
114 | Tensorflow/Keras | 35ca57d39b9e368ef43302421db774e4ac3e3625 | Python | fix | overflow/underflow | overflow/underflow | statistical distributions | statistical distributions | negative binomial distribution | rewrite math formula | rewrite math formula | Use log_sigmoid instead of log1p and log. Also, use logits instead of probabilities | return (self.total_count * math_ops.log1p(-self.probs) - + x * math_ops.log(self.probs)) |
return (self.total_count * math_ops.log_sigmoid(-self.logits) + + x * math_ops.log_sigmoid(self.logits)) |
def testLogProbOverflow(self): + with self.test_session() as sess: + logits = np.float32([20., 30., 40.]) + total_count = np.float32(1.) + x = np.float32(0.) + nb = negative_binomial.NegativeBinomial( + total_count=total_count, logits=logits) + log_prob_ = sess.run(nb.log_prob(x)) + self.assertAllEqual(np.ones_like(log_prob_, dtype=np.bool), + np.isfinite(log_prob_)) + + def testLogProbUnderflow(self): + with self.test_session() as sess: + logits = np.float32([-90, -100, -110]) + total_count = np.float32(1.) + x = np.float32(0.) + nb = negative_binomial.NegativeBinomial( + total_count=total_count, logits=logits) + log_prob_ = sess.run(nb.log_prob(x)) + self.assertAllEqual(np.ones_like(log_prob_, dtype=np.bool), + np.isfinite(log_prob_)) |
log | ||||
115 | Tensorflow/Keras | 2114fd51e9e4fe3cefc058fe42363f68126a9da6 | C++ | fix | overflow/underflow | overflow/underflow | softplus(x) = log(exp(x) + 1); softplus is a smooth approximation of relu. Like relu, softplus always takes on positive values. |
activation functions | activation functions | softplus | rewrite math formula | rewrite math formula | XLAJIT_MAKE_UNARY(Softplus, - b->Log(b->Add(b->Exp(x), XlaHelpers::One(b, input_type(0))))); |
static xla::ComputationDataHandle Softplus( + xla::ComputationBuilder* b, DataType dtype, + const xla::ComputationDataHandle& features) { + xla::ComputationDataHandle threshold = + b->Add(b->Log(XlaHelpers::Epsilon(b, dtype)), + XlaHelpers::FloatLiteral(b, dtype, 2.0)); + // Value above which exp(x) may overflow, but softplus(x) == x + // is within machine epsilon. + xla::ComputationDataHandle too_large = b->Gt(features, b->Neg(threshold)); + // Value below which exp(x) may underflow, but softplus(x) == exp(x) + // is within machine epsilon. + xla::ComputationDataHandle too_small = b->Lt(features, threshold); + xla::ComputationDataHandle features_exp = b->Exp(features); + xla::ComputationDataHandle output = b->Select( + too_large, features, + b->Select(too_small, features_exp, + b->Log(b->Add(features_exp, XlaHelpers::One(b, dtype))))); + return output; +} +XLAJIT_MAKE_UNARY(Softplus, Softplus(b, input_type(0), x)); |
def _assertSoftplusMatchesExpected(self, features, dtype): + features = np.array(features, dtype=dtype) + zero = np.asarray(0).astype(dtype) + expected = np.logaddexp(zero, features) + self._assertOpOutputMatchesExpected( + nn_ops.softplus, features, expected=expected) + + def testSoftplus(self): + for dtype in self.float_types: + self._assertSoftplusMatchesExpected([[-2, 0, 8]], dtype) + self._assertSoftplusMatchesExpected( + [[-9, 7, -5, 3, -1], [1, -3, 5, -7, 9]], dtype) + log_eps = np.log(np.finfo(dtype).eps) + one = dtype(1) + ten = dtype(10) + self._assertSoftplusMatchesExpected([ + log_eps, log_eps - one, log_eps + one, log_eps - ten, + log_eps + ten, -log_eps, -log_eps - one, -log_eps + one, + -log_eps - ten, -log_eps + ten], dtype) |
|||||
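Row 115 clamps both tails of softplus: where exp(x) would overflow, softplus(x) equals x to machine precision, and where it would underflow, softplus(x) equals exp(x). A NumPy sketch of the same piecewise idea, checked against np.logaddexp(0, x) = log(exp(0) + exp(x)) (the threshold mirrors the log-epsilon trick but is only illustrative):

```python
import numpy as np

def stable_softplus(x, dtype=np.float64):
    x = np.asarray(x, dtype=dtype)
    threshold = np.log(np.finfo(dtype).eps) + 2.0        # ~ -34 for float64
    with np.errstate(over="ignore"):                     # overflowing branches are discarded by where
        return np.where(x > -threshold, x,               # huge x: softplus(x) ~= x
               np.where(x < threshold, np.exp(x),        # very negative x: softplus(x) ~= exp(x)
                        np.log1p(np.exp(x))))            # safe middle range

x = np.array([-1000.0, -1.0, 0.0, 1.0, 1000.0])
print(stable_softplus(x))
print(np.logaddexp(0.0, x))   # reference values
```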
116 | Tensorflow/Keras | 448de13b1ae2ebc96a49785cee5ae98db1ae7b06 | C++ | fix | overflow/underflow | linear algebra | determinant | log determinant of a matrix | use a different algorithm | use a different algorithm | Compute the log determinant through a Partially Pivoted LU decomposition | |||||||||
117 | Tensorflow/Keras | 1193b39c9e58545ac35aae19dfa34a06bdfae073 | Python | fix | underflow | Poisson is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event. Lambda can be 0 through infinity. | for a small rate lambda in the Poisson distribution, e^(-lambda) causes numerical stability issues, because exp of a very small number produces an exponentially smaller number, which leads to a risk of underflow | statistical distributions | statistical distributions | poisson distribution | rewrite math formula | rewrite math formula | Use the log of the rate instead of the plain rate to avoid exponentiating very small numbers | with ops.control_dependencies([check_ops.assert_positive(rate)] if - validate_args else []): - self._rate = array_ops.identity(rate, name="rate") |
if (rate is None) == (log_rate is None): + raise ValueError("Must specify exactly one of `rate` and `log_rate`.") + elif log_rate is None: + rate = ops.convert_to_tensor(rate, name="rate") + if not rate.dtype.is_floating: + raise TypeError("rate.dtype ({}) is a not a float-type.".format( + rate.dtype.name)) + with ops.control_dependencies([check_ops.assert_positive(rate)] if + validate_args else []): + self._rate = array_ops.identity(rate, name="rate") + self._log_rate = math_ops.log(rate, name="log_rate") + else: + log_rate = ops.convert_to_tensor(log_rate, name="log_rate") + if not log_rate.dtype.is_floating: + raise TypeError("log_rate.dtype ({}) is a not a float-type.".format( + log_rate.dtype.name)) + self._rate = math_ops.exp(log_rate, name="rate") + self._log_rate = ops.convert_to_tensor(log_rate, name="log_rate") class PoissonLogRateTest(PoissonTest): + + def _make_poisson(self, rate, validate_args=False): + return poisson_lib.Poisson( + log_rate=math_ops.log(rate), + validate_args=validate_args) + + def testInvalidLam(self): + # No need to worry about the non-negativity of `rate` when using the + # `log_rate` parameterization. + pass |
exp | ||||
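With row 117's log_rate parameterization, the Poisson log-probability log p(k) = k·log(rate) − rate − log(k!) can be evaluated without ever recovering log(rate) from a rounded or underflowed rate. A hedged NumPy/SciPy sketch of that formula (not the TF distribution class):

```python
import numpy as np
from scipy.special import gammaln

def poisson_log_prob(k, log_rate):
    # log p(k) = k * log_rate - rate - log(k!), with rate = exp(log_rate).
    k = np.asarray(k, dtype=np.float64)
    return k * log_rate - np.exp(log_rate) - gammaln(k + 1.0)

# rate = exp(-800) underflows to 0.0 in float64, so log(rate) would be -inf;
# the log_rate form stays exact.
print(poisson_log_prob(0, -800.0))        # ~ 0.0, i.e. p(0) ~= 1
print(poisson_log_prob(3, np.log(2.0)))   # ~ -1.7123, the usual Poisson(2) value
```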
118 | Tensorflow/Keras | 0cff60ebb29f5aba5092988c8b7f13c258115e81 | Python | fix | overflow/underflow | linear algebra | linear algebra | log of hermitian matrix determinant | use a different algorithm | use a different algorithm | Use the property that log det(A) = 2*sum(log(real(diag(C)))), where C is the Cholesky decomposition of A. Add a function to compute the natural log of the determinant for hermitian positive definite matrices in a numerically stable way via Cholesky decompositions. Equivalent to numpy.linalg.slogdet, although no sign is returned since only hermitian positive definite matrices are supported. |
def logdet(matrix, name=None): + """Computes log of the determinant of a hermitian positive definite matrix. + + ```python + # Compute the determinant of a matrix while reducing the chance of over- or + underflow: + A = ... # shape 10 x 10 + det = tf.exp(tf.logdet(A)) # scalar + ``` + + Args: + matrix: A `Tensor`. Must be `float32`, `float64`, `complex64`, or + `complex128` with shape `[..., M, M]`. + name: A name to give this `Op`. Defaults to `logdet`. + + Returns: + The natural log of the determinant of `matrix`. + + @compatibility(numpy) + Equivalent to numpy.linalg.slogdet, although no sign is returned since only + hermitian positive definite matrices are supported. + @end_compatibility + """ + # This uses the property that the log det(A) = 2*sum(log(real(diag(C)))) + # where C is the cholesky decomposition of A. + with ops.name_scope(name, 'logdet', [matrix]): + chol = gen_linalg_ops.cholesky(matrix) + return 2.0 * math_ops.reduce_sum( + math_ops.log(math_ops.real(array_ops.matrix_diag_part(chol))), + reduction_indices=[-1]) |
||||||||
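The identity quoted in row 118, log det(A) = 2·Σ log(diag(chol(A))), can be checked in a few lines of NumPy; the determinant itself underflows long before its logarithm does. A sketch assuming a Hermitian positive definite input:

```python
import numpy as np

def logdet_hpd(matrix):
    # log|det(A)| for Hermitian positive definite A via its Cholesky factor.
    chol = np.linalg.cholesky(matrix)
    diag = np.real(np.diagonal(chol, axis1=-2, axis2=-1))
    return 2.0 * np.sum(np.log(diag), axis=-1)

a = np.diag(np.full(400, 1e-3))      # det(a) = 1e-1200, far below the float64 range
print(np.linalg.det(a))              # 0.0 (underflow)
print(logdet_hpd(a))                 # ~ -2763.1 == 400 * log(1e-3)
print(np.linalg.slogdet(a)[1])       # same value, for comparison
```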
119 | Tensorflow/Keras | b85601b95eba28605d3de076fa70cabf2f2e32b9 | Python | fix | loss of precision | incorrect result | In probability theory, an ƒ-divergence is a function Df (P || Q) that measures the difference between two probability distributions P and Q. | If probability distribution Q is not reparameterized, TensorFlow's gradient will be incorrect since the chain-rule stops at samples of unreparameterized distributions | other | probability | ƒ-divergence | use a different algorithm | use a different algorithm | Improve score-trick to be a valid Csiszar f-Divergence yet numerically stable. Using the Score-Gradient trick results in an unbiased gradient | nabla E_q[f(X)] - = nabla int dx q(x) f(x) - = int dx nabla [ q(x) f(x) ] - = int dx q'(x) f(x) + q(x) f'(x) - = int dx q(x) nabla [ log(q(x)) stopgrad[f(x)] + f(x) ] - = E_q[ nabla [ log(q(X)) stopgrad[f(X)] + f(X) ] ] - ~= Avg{ log(q(y_i)) stopgrad[f(y_i)] + f(y_i) : y_i = stopgrad[x_i], x_i ~ q} |
grad[ E_q[f(X)] ] + = grad[ int dx q(x) f(x) ] + = int dx grad[ q(x) f(x) ] + = int dx [ q'(x) f(x) + q(x) f'(x) ] + = int dx q(x) grad[ f(x) q(x) / stop_grad[q(x)] ] + = E_q[ grad[ f(x) q(x) / stop_grad[q(x)] ] ] |
||||
120 | Tensorflow/Keras | e6126230200e2ce9c96da5c9e4dc7f104c645d11 | Python | fix | overflow/underflow | overflow/underflow | for very small or very large numbers, naive direct computation of the log of a sum of exponentials risks underflow and overflow, respectively | other | other | Gaussian mixture model, log probability | rewrite math formula | rewrite math formula | Use the TensorFlow log(sum(exp)) function (reduce_logsumexp) to work in log scale, which is more numerically stable than log -> sum -> exp for calculating the log probability | self._prior_probs[shard_id] = math_ops.log( - math_ops.reduce_sum( - math_ops.exp(self._probs[shard_id]), 1, keep_dims=True)) |
self._prior_probs[shard_id] = math_ops.reduce_logsumexp( + self._probs[shard_id], axis=1, keep_dims=True) |
def test_random_input_large(self): + # sklearn version. + iterations = 5 # that should be enough to know whether this diverges + np.random.seed(5) + num_classes = 20 + x = np.array([[np.random.random() for _ in range(100)] + for _ in range(num_classes)], dtype=np.float32) + + # skflow version. + gmm = gmm_lib.GMM(num_classes, + covariance_type='full', + config=run_config.RunConfig(tf_random_seed=2)) + + def get_input_fn(x): + def input_fn(): + return constant_op.constant(x.astype(np.float32)), None + return input_fn + + gmm.fit(input_fn=get_input_fn(x), steps=iterations) + self.assertFalse(np.isnan(gmm.clusters()).any()) |
log sum of exp | |||
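Row 120's reduce_logsumexp uses the shift-by-max identity log(sum(exp(x_i))) = m + log(sum(exp(x_i - m))) with m = max(x_i), so every exponential is at most 1. A NumPy sketch of the identity (not the TF implementation):

```python
import numpy as np

def logsumexp(x):
    # Shift by the maximum so no exponential can overflow.
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))

x = np.array([1000.0, 999.0, 998.0])
with np.errstate(over="ignore"):
    print(np.log(np.sum(np.exp(x))))   # inf: exp(1000) overflows
print(logsumexp(x))                    # ~ 1000.4076
```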
121 | Tensorflow/Keras | fdbd02c8d7f07bd1207938662716fad8857dcd55 | Python | fix | loss of precision | deals with the shift parameter, but this feature is not available in TF now | tensor math | statistics | mean, variance | rewrite math formula | rewrite math formula | change the shift value for calculating mean | shift = math_ops.cast(shift, dtypes.float32) if ( - shift is not None and x.dtype == dtypes.float16) else shift |
if shift is None: + # Compute true mean while keeping the dims for proper broadcasting. + shift = array_ops.stop_gradient( + math_ops.reduce_mean(y, axes, keep_dims=True)) + else: + shift = math_ops.cast(shift, y.dtype) + # Reshape shift as needed. + shift = array_ops.reshape(shift, array_ops.shape(m_ss)) + shift.set_shape(m_ss.get_shape()) |
||||||
122 | Tensorflow/Keras | 7c97f13ace37ac73bb820dec941c55ae4d538581 | Python | fix | underflow | Student's t-distribution is defined as the distribution of the random variable t which is (very loosely) the "best" that we can do not knowing sigma. | statistical distributions | statistical distributions | student t distribution log probability | rewrite math formula | rewrite math formula | use log1p instead of log. The function log1p computes the natural logarithm of (1 + x) element-wise. | def _log_prob(self, x): y = (x - self.mu) / self.sigma half_df = 0.5 * self.df return (math_ops.lgamma(0.5 + half_df) - math_ops.lgamma(half_df) - 0.5 * math_ops.log(self.df) - 0.5 * math.log(math.pi) - math_ops.log(self.sigma) - (0.5 + half_df) * math_ops.log(1. + math_ops.square(y) / self.df)) |
def _log_prob(self, x): return self._log_unnormalized_prob(x) - self._log_normalization() def _log_unnormalized_prob(self, x): y = (x - self.mu) / self.sigma # Abs(sigma) superfluous. return -0.5 * (self.df + 1.) * math_ops.log1p(y**2. / self.df) def _log_normalization(self): return (math_ops.log(math_ops.abs(self.sigma)) + 0.5 * math_ops.log(self.df) + 0.5 * np.log(np.pi) + math_ops.lgamma(0.5 * self.df) - math_ops.lgamma(0.5 * (self.df + 1.))) |
log | |||||
123 | Tensorflow/Keras | de6ce1de08ea97d599687fbbe5196ca4af5232ae | C++ | fix | overflow | large logit values were not properly handled in multinomial distribution | statistical distributions | statistical distributions | Multinomial distribution | rewrite math formula | rewrite math formula | subtract a maximum from logits before taking exponentials | running_total += std::exp(static_cast<float>(logits_row[j])) | // Takes an along-class maximum (for numerical stability). + T max = std::numeric_limits<T>::lowest(); + for (int64 j = 0; j < num_classes; ++j) { + if (std::isfinite(static_cast<float>(logits_row[j]))) { + max = std::max(max, logits_row[j]); + } + } + const float max_logit = static_cast<float>(max); running_total += std::exp(static_cast<float>(logits_row[j]) - max_logit); |
def testLargeLogits(self): + for neg in [True, False]: + with self.test_session(use_gpu=self.use_gpu): + logits = np.array([[1000.] * 5]) + if neg: + logits *= -1 + samples = tf.multinomial(logits, 10).eval() + # Sampled classes should be in-range. + self.assertTrue((samples >= 0).all()) + self.assertTrue((samples < 5).all()) |
exponential | ||||
124 | Tensorflow/Keras | e47dc8593d11be8cd82767965b8b75b6307c07e4 | Python | fix | loss of precision | There is evidence that the 'shift' strategy in computing the sufficient statistics of the moments is actually leading to worse numerical stability for batch normalization. |
tensor math | statistics | mean, variance | other | amend algorithm | set shift parameter as a non-default argument in moments method that calculates mean and variance and is utilized in batch normalization | def sufficient_statistics(x, axes, shift=True, keep_dims=False, name=None) | def sufficient_statistics(x, axes, shift=False, keep_dims=False, name=None) | variance, mean | |||||
125 | Tensorflow/Keras | ab1165c4908b70441f1ddea24821a8b84a806ddc | C++ | fix | overflow/underflow | Legalization is the phase in code generation that eradicates any instructions that are not supported by the target. | Multi-Level IR Compiler Framework | activation functions | activation functions | sigmoid, compiler | other | amend algorithm | This function converts Sigmoid op to HLO ops computing sigmoid | class ConvertSigmoidOp : public OpRewritePattern<TF::SigmoidOp> { - using OpRewritePattern::OpRewritePattern; - - LogicalResult matchAndRewrite(TF::SigmoidOp op, |
class ConvertSigmoidOp : public RewritePattern { public: + explicit ConvertSigmoidOp(MLIRContext *context) + : RewritePattern( + TF::SigmoidOp::getOperationName(), 0, context, + {mhlo::ConstOp::getOperationName(), + shape::ShapeOfOp::getOperationName(), + shape::ToExtentTensorOp::getOperationName(), + mhlo::DynamicBroadcastInDimOp::getOperationName(), + mhlo::MulOp::getOperationName(), mhlo::TanhOp::getOperationName(), + mhlo::AddOp::getOperationName()}) {} + + LogicalResult matchAndRewrite(Operation *sigmoid_op, PatternRewriter &rewriter) const override { + auto op = cast<TF::SigmoidOp>(sigmoid_op); |
|||||
126 | Tensorflow/Keras | 6acd86d539464b611d37b8dc13251fafab25fb5c | C++ | fix | loss of precision | tensor math | tensor math | argmin | rewrite math formula | rewrite math formula | amend logic for tie breaking | |||||||||
127 | Tensorflow/Keras | f73e9d61a7c577a5182701d3aa5bba8d6d69f87d | C++ | fix | loss of precision | tensor math | tensor math | argmin, argmax | rewrite math formula | rewrite math formula | amend logic for tie breaking | |||||||||
128 | Tensorflow/Keras | ee85e6d230278e763a2784ba86acc747abdb2242 | C++ | fix | loss of precision | MeanStddevNormalization is numerically unstable | tensor math | statistics | variance | use a different algorithm | use a different algorithm | Use the numerically stable two-pass algorithm to calculate variance in MeanStddevNormalization. | float sum_sq = 0.0f; sum_sq += input_vector[i] * input_vector[i]; } const float variance = sum_sq / v_size - mean * mean; |
float sum_diff_sq = 0.0f; + for (int i = 0; i < v_size; ++i) { + const float diff = input_vector[i] - mean; + sum_diff_sq += diff * diff; + } + const float variance = sum_diff_sq / v_size; |
test accuracy for: small mean, small variance; small mean, large variance; large mean, zero variance; large mean, small variance; large mean, large variance |
|||||
129 | Tensorflow/Keras | f42d9846f6942e497645af28b3506e6163bdc8bf | C++ | fix | underflow | mel spectrogram is a spectrogram where the frequencies are converted to the mel scale. It is used in signal processing and it involves mapping audio signal from the time to frequency domain using fast fourier transform | loss functions | loss functions | logistic loss, uniform distribution sampling, Mel-Frequency Cepstral Coefficient (MFCC) calculation | use a different algorithm | use a different algorithm | Replace log(1 + x) with numerically more stable log1p(x) | LogisticLossUpdater : public DualLossUpdater { - return log(1 + exp(-y_wx)) * example_weight; double MfccMelFilterbank::FreqToMel(double freq) const { - return 1127.0 * log(1.0 + (freq / 700.0)); LogUniformSampler::LogUniformSampler(int64 range) - : RangeSampler(range), log_range_(log(range + 1)) {} static float FreqToMel(float freq) { - return 1127.0 * log(1.0 + (freq / 700.0)); double MfccMelFilterbank::FreqToMel(double freq) const { - return 1127.0 * log(1.0 + (freq / 700.0)); |
class LogisticLossUpdater : public DualLossUpdater { + return log1p(exp(-y_wx)) * example_weight; double MfccMelFilterbank::FreqToMel(double freq) const { + return 1127.0 * log1p(freq / 700.0); LogUniformSampler::LogUniformSampler(int64 range) + : RangeSampler(range), log_range_(log1p(range)) {} static float FreqToMel(float freq) { + return 1127.0 * log1p(freq / 700.0); double MfccMelFilterbank::FreqToMel(double freq) const { + return 1127.0 * log1p(freq / 700.0); |
||||||
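The log1p substitution in row 129 matters when |x| is far below machine epsilon: 1 + x then rounds to exactly 1 and log(1 + x) returns 0, while log1p keeps full relative accuracy. A two-line NumPy check:

```python
import numpy as np

x = 1e-18
print(np.log(1.0 + x))   # 0.0 -- the addition already rounded 1 + x to 1.0
print(np.log1p(x))       # 1e-18, accurate because the 1 is never added explicitly
```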
130 | Tensorflow/Keras | 0fe671dd0a14614edbbd50397777def3bff770cc | Cuda | fix | loss of precision | Eigen MeanReducer is numerically unstable due to unstable summation operation. Summing numbers of different magnitude leads to loss of precision, numbers should be sorted | tensor math | statistics | mean | use a different algorithm | use a different algorithm | Don't use the numerically unstable MeanReducer class in Eigen. | ||||||||
131 | Tensorflow/Keras | f84e8257aa88fa45cc7a15835ad386565cd60237 | C++ | fix | loss of precision | Eigen MeanReducer is numerically unstable due to an unstable summation operation. Summing numbers of different magnitude leads to loss of precision; the numbers should be sorted | CNN operations | pooling layer | mean, average pooling | use a different algorithm | use a different algorithm | Change the Eigen reduction code to use a tree to improve numerical stability. This changes the InnerMostDimReducer to use a summation tree, which is more numerically stable than the previous approach of sequential addition into an accumulator. This solves the issue for reduction over all or a trailing subset of dimensions. This change does not improve the numerical accuracy for MeanReducer, which maintains state. Benchmarks show a 40% (AVX) to 50% (SSE) slowdown for small row reductions (sum, float). Column- and full reductions are unchanged. |
||||||||
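Rows 112 and 131 replace sequential accumulation with a summation tree, whose rounding error grows roughly with log n rather than n because partial sums of similar magnitude are added together. A recursive NumPy sketch of pairwise summation (illustrative, not the Eigen InnerMostDimReducer):

```python
import numpy as np

def tree_sum(x):
    # Pairwise (tree) summation: split in half, sum each half, add once.
    if x.size <= 8:                      # small leaves: direct accumulation is fine
        return np.add.reduce(x)
    mid = x.size // 2
    return tree_sum(x[:mid]) + tree_sum(x[mid:])

x = np.full(1_000_000, 0.1, dtype=np.float32)   # true sum is 100000

seq = np.float32(0.0)
for v in x:                              # sequential float32 accumulation
    seq = seq + v
print(seq)                               # drifts far from 100000
print(tree_sum(x))                       # much closer to 100000
```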
132 | Tensorflow/Keras | fa2132ab65f92ea40c94152dba105a9f86a0a555 | Python | fix | loss of precision | unsorted sum is numerically unstable | gradients/derivatives | gradients | gradients, hessians, boosted trees | increase variable precision/change variable type | increase variable precision | Use 64bit aggregation for gradients and hessians since the 32 bit version is numerically unstable for large minibatches. | per_partition_hessians = math_ops.unsorted_segment_sum( - hessians, mapped_partitions, array_ops.size(unique_partitions)) |
# Since unsorted_segment_sum can be numerically unstable, use 64bit + # operation. + gradients64 = math_ops.cast(gradients, dtypes.float64) + hessians64 = math_ops.cast(hessians, dtypes.float64) per_partition_gradients = math_ops.unsorted_segment_sum( + gradients64, mapped_partitions, array_ops.size(unique_partitions)) per_partition_hessians = math_ops.unsorted_segment_sum( + hessians64, mapped_partitions, array_ops.size(unique_partitions)) + per_partition_gradients = math_ops.cast(per_partition_gradients, + dtypes.float32) + per_partition_hessians = math_ops.cast(per_partition_hessians, + dtypes.float32) |
sum | |||||
133 | Tensorflow/Keras | 48adc7ba73177f2a9331918b160bc3d0775985b8 | Python | fix | underflow | square root | linear algebra | norm | L2 norm | rewrite math formula | rewrite math formula | Avoid potentially numerically unstable square root in the linalg_ops.norm() | mean = math_ops.square(linalg_ops.norm(m - m_w)) # This uses the L2 norm. | mean = math_ops.reduce_sum( + math_ops.squared_difference(m, m_w)) # Equivalent to L2 but more stable. |
square root, mean | |||||
134 | Tensorflow/Keras | 18f860fd8e1fdffd80633cf5ac32f895423dfa8d | C++ | fix | underflow | In computing, a normal number is a non-zero number in a floating-point representation which is within the balanced range supported by a given floating-point format: it is a floating point number that can be represented without leading zeros in its significand. | other | random number generator | random number generator | limit input range | limit input range | Create uniform numbers between 1 and 1.125 instead of -0.9 and 1.0 to avoid creating denormal numbers. | std::uniform_real_distribution<FloatT> generator(-0.9f, 1.0f); [&](tensorflow::gtl::ArraySlice<int64> /*indices*/) { - return generator(engine); |
std::uniform_real_distribution<FloatT> generator(1.0f, 1.125f); [&](tensorflow::gtl::ArraySlice<int64> indices) { + // Generate a random uniforma number from -0.0625 and 0.0625 and bias it + // with a position dependent nubmer with mean 0.037109375. These number + // should allow for long chains of accumulation without being too close + // to zero or to large to accumulate all numbers accurately. + return (generator(engine) - 1.0625) + + static_cast<FloatT>(Product(indices) % 113 - 47) / + static_cast<FloatT>(256.0f); |
||||||
135 | Tensorflow/Keras | 6db014b44863bab616f026beab461fd646fcb505 | C++ | fix | overflow/underflow | gradients/derivatives | gradients | gradients testing | other | revert commit | N/A | TEST_F(NaryGradTest, Select) { + TensorShape shape({3, 2}); + auto x1 = Placeholder(scope_, DT_FLOAT, Placeholder::Shape(shape)); + auto x2 = Placeholder(scope_, DT_FLOAT, Placeholder::Shape(shape)); + // Use constant values to avoid instability when computing + Tensor c = + test::AsTensor<float>({-3.5f, 1.5f, -1.2f, 3.0f, -2.5f, 2.8f}, {3, 2}); + auto zero = Cast(scope_, Const(scope_, 0.0), c.dtype()); + auto y = Where3(scope_, Greater(scope_, c, zero), x1, x2); + RunTest({x1, x2}, {shape, shape}, {y}, {shape}); +} Status SelectGrad(const Scope& scope, const Operation& op, + const std::vector<Output>& grad_inputs, + std::vector<Output>* grad_outputs) { + auto comparator = op.input(0); + auto x = op.input(1); + auto zeros = ZerosLike(scope, x); + auto grad = grad_inputs[0]; + + auto gx_1 = Where3(scope, comparator, grad, zeros); + auto gx_2 = Where3(scope, comparator, zeros, grad); + + grad_outputs->push_back(NoGradient()); + grad_outputs->push_back(gx_1); + grad_outputs->push_back(gx_2); + return scope.status(); +} +REGISTER_GRADIENT_OP("Select", SelectGrad); |
||||||||
136 | Tensorflow/Keras | 1bbec9e4e9c5d3fbbc2fa2b58841435e86dbf76a | Cuda | fix | overflow | linear algebra | linear algebra | log determinant | use a different algorithm | use a different algorithm | Compute Determinant from a partially pivoted LU factorization Change behavior for Determinant on matrices with (numerically) infinite determinants to match the behavior of numpy.linalg.det: Return inf for matrix with infinite determinant. |
|||||||||
137 | Tensorflow/Keras | 265483857be3ca84b992937490ea8f0591b2d4ab | Python | fix | overflow/underflow | statistical distributions | statistical distributions | laplace distribution | use a different algorithm | use a different algorithm | Add more stable calculation of Log of the cumulative distribution function and log survival function | |||||||||
138 | Tensorflow/Keras | e37e792d3eb2dac7ac627b7d8d56d69360649d19 | Python | fix | loss of precision | The raw formulation of cross-entropy, tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)), reduction_indices=[1])) can be numerically unstable. |
loss functions | loss functions | cross entropy loss | rewrite math formula | rewrite math formula | we apply `tf.nn.softmax_cross_entropy_with_logits` on the unnormalized logits (e.g., we call `softmax_cross_entropy_with_logits` on `tf.matmul(x, W) + b`), because this more numerically stable function internally computes the softmax activation. |
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1])) | cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_)) | ||||||
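The fused op in row 138 folds the softmax and the log into a single log-sum-exp, so no intermediate probability can round to 0 and feed a log. A NumPy sketch of what the fused computation amounts to, assuming one-hot labels:

```python
import numpy as np

def softmax_cross_entropy_with_logits(labels, logits):
    # -sum(labels * log_softmax(logits)) with the shift-by-max built in.
    m = np.max(logits, axis=-1, keepdims=True)
    log_z = m + np.log(np.sum(np.exp(logits - m), axis=-1, keepdims=True))
    return -np.sum(labels * (logits - log_z), axis=-1)

logits = np.array([[1000.0, 0.0, -1000.0]])
labels = np.array([[0.0, 1.0, 0.0]])
# The softmax -> log -> sum route would produce log(0) = -inf for this row.
print(softmax_cross_entropy_with_logits(labels, logits))   # [1000.]
```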
139 | Tensorflow/Keras | a89c54d57209f91161fa450605f645c9124d89ac | Python | fix | loss of precision | statistical distributions | statistical distributions | Bernoulli distribution | use a different algorithm | use a different algorithm | use logits to create bernouli distribution | |||||||||
140 | Tensorflow/Keras | 14066c4b84e56c3b86f6152de1bb80df22341aa8 | Python | fix | overflow/underflow | statistical distributions | statistical distributions | log determinant, multivariate normal distribution | use a different algorithm | use a different algorithm | Compute log_determinant instead of determinant in mvn to make stable (w.r.t. under/over flow). | -def _determinant_from_sigma_chol(sigma_chol): det_last_dim = array_ops.rank(sigma_chol) - 2 sigma_batch_diag = array_ops.batch_matrix_diag_part(sigma_chol) - det = math_ops.square(math_ops.reduce_prod( - sigma_batch_diag, reduction_indices=det_last_dim)) - det.set_shape(sigma_chol.get_shape()[:-2]) - return det |
def _log_determinant_from_sigma_chol(sigma_chol): det_last_dim = array_ops.rank(sigma_chol) - 2 sigma_batch_diag = array_ops.batch_matrix_diag_part(sigma_chol) + log_det = 2.0 * math_ops.reduce_sum( + math_ops.log(sigma_batch_diag), reduction_indices=det_last_dim) + log_det.set_shape(sigma_chol.get_shape()[:-2]) + return log_det |
|||||||
141 | Tensorflow/Keras | bce6216610d57f8f4b1e9e79836737df109c4e42 | Python | fix | loss of precision | tensor math | statistics | variance with shifted data | use a different algorithm | use a different algorithm | ||||||||||
142 | Tensorflow/Keras | 66f452d2217b155b697fc6d6cef5f56599ee2bbc | C++ | fix | overflow | overflow | Only enable the HoistCommonFactorOutOfAggregation rewrite in aggressive mode, since it changes program behavior w.r.t. over- and underflow. For example, it will rewrite "0.5*x + 0.5*y" to "0.5*(x + y)", which will overflow if x + y > FLT_MAX, while the original expression does not overflow unless x + y > 2*FLT_MAX. | optimizers | optimizers | arithmetic optimizer | other | allow code rewriting only in agressive mode | Only enable the HoistCommonFactorOutOfAggregation rewrite in aggressive mode | |||||||
143 | Tensorflow/Keras | 12243e6b65958c2e0c344aa3df4875f472ce5ae0 | C++ | fix | overflow | overflow | other | computational graph | analytical cost estimator | add overflow check | add overflow check | Fix integer-overflow in `tensorflow::grappler::AnalyticalCostEstimator::PredictCosts` by using MultiplyWithoutOverflow. MultiplyWithoutOverflow multiplies unsigned ints since signed overflow is undefined and has a check for integer overflow. Return nullop if overflow occurs | size *= std::max<int64>(1, dim.size()); | size = MultiplyWithoutOverflow(size, std::max<int64>(1, dim.size())); if (size < 0) { return errors::InvalidArgument( "Integer overflow encountered in dimension size."); } |
||||||
144 | Tensorflow/Keras | cc464f04caa327d3f62d2f793a428cb7b0f0a5d7 | Python | unit test | overflow | overflow | linear algebra | linear algebra | array product | limit input range | limit input range | Limit input values to avoid integer overflow in reduction_ops_test. | # overflow, divide the incremental int32 array by 2. - for rank in range(1, _MAX_RANK + 1): - np_arr = self._makeIncremental((2,) * rank, dtypes.int32) / 2 def testInt64(self): - for rank in range(1, _MAX_RANK + 1): - np_arr = self._makeIncremental((2,) * rank, dtypes.int64) |
# overflow, limit array values. + for rank in range(1, _MAX_RANK): + np_arr = self._makeIncremental((2,) * rank, dtypes.int32) % 5 + 1 def testInt64(self): + for rank in range(1, _MAX_RANK): + # Avoid overflow by limiting array values. + np_arr = self._makeIncremental((2,) * rank, dtypes.int64) % 11 + 1 |
||||||
145 | Tensorflow/Keras | 9d40a1573849b7e21d4f2d359fd9e87c40e33c0e | Python | Disable test | overflow | overflow | tensor math | tensor math | division, mod testing | disable test/warning | disable test for division and mod | Temporarily disable div overflow edge case due to ASAN failure. | ||||||||
146 | Tensorflow/Keras | b47be308c4b5ac7babd6400a8fb40c3d8bf163d6 | C++ | fix | overflow | overflow | The original implementations of `google_floor_div`, XLA `FloorDiv` and MLIR `TF_FloorDivOp` all suffered from overflows for `abs(x) + abs(y) > INT_MAX` |
tensor math | tensor math | floor division | rewrite math formula | rewrite math formula | Rewrite the formula to: T z = x / y; return (z * y != x && (x < 0) != (y < 0)) ? z - 1 : z; (see the sketch after this row) |
def intEdgeTestData(self, dtype): + """Edge-case test data for integer types.""" + nums = np.array([np.iinfo(dtype).min, -1, 1, + np.iinfo(dtype).max], + dtype=dtype).reshape([4, 1]) + divs = nums.reshape([1, 4]) + return nums, divs + + def testFloorDivModIntEdges(self): + for dtype in [np.int32, np.int64]: + x, y = self.intEdgeTestData(dtype) + tf_floor_div = math_ops.floor_div(x, y) + np_floor_div = self.numpySafeFloorDivInt(x, y) + self.assertAllEqual(tf_floor_div, np_floor_div) + tf_floor_mod = math_ops.floormod(x, y) + np_floor_mod = self.numpySafeFloorModInt(x, y) + self.assertAllEqual(tf_floor_mod, np_floor_mod) + z = math_ops.add(math_ops.multiply(tf_floor_div, y), tf_floor_mod) + # x = floor_div(x, y) * y + floor_mod(x, y) + self.assertAllEqual(z, np.broadcast_to(x, z.shape)) + + def testTruncateDivModIntEdges(self): + for dtype in [np.int32, np.int64]: + x, y = self.intEdgeTestData(dtype) + tf_truncate_div = math_ops.truncatediv(x, y) + np_truncate_div = self.numpySafeTruncateDivInt(x, y) + self.assertAllEqual(tf_truncate_div, np_truncate_div) + tf_truncate_mod = math_ops.truncatemod(x, y) + np_truncate_mod = self.numpySafeTruncateModInt(x, y) + self.assertAllEqual(tf_truncate_mod, np_truncate_mod) + z = math_ops.add(math_ops.multiply(tf_truncate_div, y), tf_truncate_mod) + # x = truncatediv(x, y) * y + truncatemod(x, y) + self.assertAllEqual(z, np.broadcast_to(x, z.shape)) |
||||||
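Row 146's rewrite derives floor division from a single truncating quotient and a sign-based adjustment, so no intermediate such as abs(x) + abs(y) is ever formed. A Python sketch of the same logic, checked against Python's built-in floor division (Python ints cannot overflow, so this only illustrates the adjustment rule):

```python
def floor_div(x: int, y: int) -> int:
    # z = trunc(x / y); subtract 1 iff the division was inexact and the signs differ.
    z = abs(x) // abs(y)                      # magnitude of the truncating quotient
    if (x < 0) != (y < 0):
        z = -z
    if z * y != x and (x < 0) != (y < 0):
        z -= 1
    return z

for x, y in [(7, 2), (-7, 2), (7, -2), (-7, -2), (-8, 2), (2**31 - 1, -1)]:
    assert floor_div(x, y) == x // y
```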
147 | Tensorflow/Keras | 4c0ee937c0f61c4fc5f5d32d9bb4c67428012a60 | C++ | fix | overflow | overflow | other | sparse operations | sparse operations | use a different algorithm | use a different algorithm | Prevent overflow by constructing the dense shape separately | sparse::SparseTensor sparse_tensor; OP_REQUIRES_OK(context, - sparse::SparseTensor::Create( - input_indices, input_values, - TensorShape(input_shape.vec<int64>()), &sparse_tensor)); |
TensorShape dense_shape; + const auto input_shape_flat = input_shape.flat<int64>(); + for (int i = 0; i < input_shape.NumElements(); i++) { + OP_REQUIRES_OK(context, + dense_shape.AddDimWithStatus(input_shape_flat(i))); + } sparse::SparseTensor sparse_tensor; OP_REQUIRES_OK(context, TensorShape(input_shape.vec<int64>()), &sparse_tensor)); + sparse::SparseTensor::Create(input_indices, input_values, + dense_shape, &sparse_tensor)); |
||||||
148 | Tensorflow/Keras | 7c8cc4ec69cd348e44ad6a2699057ca88faad3e5 | C++ | fix | overflow | overflow | Op that looks up items from a sparse tensor in an embedding matrix. The sparse lookup tensor is represented by three individual tensors: lookup, indices, and dense_shape. | integer overflow | other | sparse operations | sparse operations, embedding | add overflow check | add overflow check | ensure that output is not a null pointer that indicates overflow | N/A | TF_LITE_ENSURE(context, output_shape != nullptr); | ||||
149 | Tensorflow/Keras | 37054f9134af917ded7f40c7d663fa490d85c7d4 | C++ | fix | overflow | overflow | activation functions | activation functions | range of activation function, quantization | add overflow check | add overflow check | Add extra robustness by adding more overflow checks to CalculateActivationRangeQuantized for cases where output tensor has bad, but still valid quantization parameters, which cause integer overflow. |
void CalculateActivationRangeQuantizedImpl(TfLiteFusedActivation activation, - int32_t qmin, int32_t qmax, - TfLiteTensor* output, - int32_t* act_min, int32_t* act_max) { if (activation == kTfLiteActRelu) { - *act_min = std::max(qmin, quantize(0.0)); } else if (activation == kTfLiteActRelu6) { - *act_min = std::max(qmin, quantize(0.0)); - *act_max = std::min(qmax, quantize(6.0)); } else if (activation == kTfLiteActReluN1To1) { - *act_min = std::max(qmin, quantize(-1.0)); - *act_max = std::min(qmax, quantize(1.0)); |
inline TfLiteStatus Quantize(TfLiteContext* context, float scale, + int32_t zero_point, float f, int32_t& q) { + const float tmp = TfLiteRound(f / scale); + const bool no_integer_overflow_from_quantization = + (tmp >= std::numeric_limits<int32_t>::min() && + tmp <= std::numeric_limits<int32_t>::max()); + TF_LITE_ENSURE(context, no_integer_overflow_from_quantization); + q = zero_point + static_cast<int32_t>(tmp); + return kTfLiteOk; +} + +TfLiteStatus CalculateActivationRangeQuantizedImpl( + TfLiteContext* context, TfLiteFusedActivation activation, int32_t qmin, + int32_t qmax, TfLiteTensor* output, int32_t* act_min, int32_t* act_max) { + int32_t tmp_q; if (activation == kTfLiteActRelu) { + TF_LITE_ENSURE_OK(context, + Quantize(context, scale, zero_point, 0.0, tmp_q)); + *act_min = std::max(qmin, tmp_q); *act_max = qmax; } else if (activation == kTfLiteActRelu6) { + TF_LITE_ENSURE_OK(context, + Quantize(context, scale, zero_point, 0.0, tmp_q)); + *act_min = std::max(qmin, tmp_q); + TF_LITE_ENSURE_OK(context, + Quantize(context, scale, zero_point, 6.0, tmp_q)); + *act_max = std::min(qmax, tmp_q); } else if (activation == kTfLiteActReluN1To1) { + TF_LITE_ENSURE_OK(context, + Quantize(context, scale, zero_point, -1.0, tmp_q)); + *act_min = std::max(qmin, tmp_q); + TF_LITE_ENSURE_OK(context, + Quantize(context, scale, zero_point, 1.0, tmp_q)); + *act_max = std::min(qmax, tmp_q); } else { *act_min = qmin; *act_max = qmax; } + return kTfLiteOk; |
TEST_F(KernelUtilTest, ActivationRangeQuantizedOverflow) { + // Create output. + TfLiteTensor output = {}; + output.type = kTfLiteUInt8; + output.allocation_type = kTfLiteArenaRw; + output.dims = nullptr; + TfLiteQuantizationParams output_quant = {1e-10, -128}; + output.params = output_quant; + output.quantization.type = kTfLiteAffineQuantization; + auto* output_params = reinterpret_cast<TfLiteAffineQuantization*>( + malloc(sizeof(TfLiteAffineQuantization))); + output_params->scale = TfLiteFloatArrayCreate(1); + output_params->scale->data[0] = 1; + output_params->zero_point = TfLiteIntArrayCreate(1); + output_params->zero_point->data[0] = -128; + output.quantization.params = reinterpret_cast<void*>(output_params); + + // For bounded activation, a too small scale value may cause overflow. + // Make sure overflow error is handled gracefully. + int32_t act_min, act_max; + ASSERT_EQ(kTfLiteOk, + CalculateActivationRangeQuantized(&context_, kTfLiteActRelu, + &output, &act_min, &act_max)); + ASSERT_NE(kTfLiteOk, + CalculateActivationRangeQuantized(&context_, kTfLiteActRelu6, + &output, &act_min, &act_max)); + EXPECT_TRUE(absl::StrContains( + context_.error, "no_integer_overflow_from_quantization was not true")); + ASSERT_NE(kTfLiteOk, + CalculateActivationRangeQuantized(&context_, kTfLiteActReluN1To1, + &output, &act_min, &act_max)); + EXPECT_TRUE(absl::StrContains( + context_.error, "no_integer_overflow_from_quantization was not true")); + + // Release. + TfLiteTensorFree(&output); |
|||||
150 | Tensorflow/Keras | 4253f96a58486ffe84b61c0415bb234a4632ee73 | C++ | fix | overflow | overflow | integer overflow | other | other | concatenate | limit input range | limit input range | TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) { TF_LITE_ENSURE_EQ(context, t->type, input_type); for (int d = 0; d < t0->dims->size; ++d) { if (d == axis) { + // Avoid integer overflow in sum_axis below + TF_LITE_ENSURE(context, t->dims->data[axis] >= 0); + TF_LITE_ENSURE(context, t->dims->data[axis] <= + std::numeric_limits<int>::max() - sum_axis); |
sum | ||||||
151 | Tensorflow/Keras | 704866eabe03a9aeda044ec91a8d0c83fc1ebdbe | C++ | fix | overflow | overflow | other | other | join segments | limit input range | limit input range | OP_REQUIRES(context, num_segments_tensor.NumElements() != 0, + errors::InvalidArgument("Number of segments cannot be empty.")); |
||||||||
152 | Tensorflow/Keras | 87cf4d3ea9949051e50ca3f071fc909538a51cd0 | C++ | fix | overflow | overflow | other | sparse operations | sparse operations (concat) | add overflow check | add overflow check | break if overflow occurred | bool overflow_ocurred = false; for (int i = 0; i < N; i++) { + int64 new_num_elements = 1; OP_REQUIRES(context, TensorShapeUtils::IsVector(shapes[i].shape()), errors::InvalidArgument( "Input shapes should be a vector but received shape ", shapes[i].shape().DebugString(), " at position ", i)); + auto input_shape_vector = shapes[i].vec<int64>(); + for (int j = 0; j < input_shape_vector.size(); j++) { + new_num_elements = + MultiplyWithoutOverflow(new_num_elements, input_shape_vector(j)); + if (new_num_elements < 0) { + overflow_ocurred = true; + break; + } + } + + if (overflow_ocurred) { + break; + } } + OP_REQUIRES( + context, !overflow_ocurred, + errors::Internal("Encountered overflow from large input shape.")); |
|||||||
153 | Tensorflow/Keras | b432a38fe0e1b4b904a6c222cbce794c39703e87 | C++ | unit test | overflow | overflow | data processing | image processing | bounding boxes for image processing | limit input range | limit input range | replace CHECK macros with OP_REQUIRES argument validation in the draw-bounding-boxes function and add useful error messages when an argument is invalid | CHECK_GE(min_box_row_clamp, 0); - CHECK_GE(max_box_row_clamp, 0); - CHECK_LT(min_box_row_clamp, height); - CHECK_LT(max_box_row_clamp, height); - CHECK_GE(min_box_col_clamp, 0); - CHECK_GE(max_box_col_clamp, 0); - CHECK_LT(min_box_col_clamp, width); - CHECK_LT(max_box_col_clamp, width); - CHECK_LT(min_box_row, height); - CHECK_GE(max_box_row, 0); - CHECK_LT(min_box_col, width); - CHECK_GE(max_box_col, 0); |
OP_REQUIRES( + context, min_box_row_clamp >= 0, + errors::InvalidArgument("Min box row clamp is less than 0.")); + OP_REQUIRES( + context, max_box_row_clamp >= 0, + errors::InvalidArgument("Max box row clamp is less than 0.")); + OP_REQUIRES(context, min_box_row_clamp <= height, + errors::InvalidArgument( + "Min box row clamp is greater than height.")); + OP_REQUIRES(context, max_box_row_clamp <= height, + errors::InvalidArgument( + "Max box row clamp is greater than height.")); + + OP_REQUIRES( + context, min_box_col_clamp >= 0, + errors::InvalidArgument("Min box col clamp is less than 0.")); + OP_REQUIRES( + context, max_box_col_clamp >= 0, + errors::InvalidArgument("Max box col clamp is less than 0.")); + OP_REQUIRES(context, min_box_col_clamp <= width, + errors::InvalidArgument( + "Min box col clamp is greater than width.")); + OP_REQUIRES(context, max_box_col_clamp <= width, + errors::InvalidArgument( + "Max box col clamp is greater than width.")); + OP_REQUIRES( + context, min_box_row <= height, + errors::InvalidArgument("Min box row is greater than height.")); + OP_REQUIRES(context, max_box_row >= 0, + errors::InvalidArgument("Max box row is less than 0.")); + OP_REQUIRES( + context, min_box_col <= width, + errors::InvalidArgument("Min box col is greater than width.")); + OP_REQUIRES(context, max_box_col >= 0, + errors::InvalidArgument("Max box col is less than 0.")); |
||||||
154 | Tensorflow/Keras | 87d2b9751513253058be671313db3e32cc13842a | C++ | unit test | overflow | overflow | other | sparse operations | sparse operations (concat) | fix test/warning | fix overflow check | shapes.size())); - bool overflow_ocurred = false; for (int i = 0; i < N; i++) { - int new_num_elements = 1; OP_REQUIRES(context, TensorShapeUtils::IsVector(shapes[i].shape()), errors::InvalidArgument( "Input shapes should be a vector but received shape ", shapes[i].shape().DebugString(), " at position ", i)); - auto input_shape_vector = shapes[i].vec<int64>(); - for (int j = 0; j < input_shape_vector.size(); j++) { - new_num_elements = - MultiplyWithoutOverflow(new_num_elements, input_shape_vector(j)); - if (new_num_elements < 0) { - overflow_ocurred = true; - break; - } - } - - if (overflow_ocurred) { - break; - } } - OP_REQUIRES( - context, !overflow_ocurred, - errors::Internal("Encountered overflow from large input shape.")); |
N/A | |||||||
155 | Tensorflow/Keras | 7bb2d255e6d404cbfa528d0ffc2f22248e6c1b21 | Cuda | fix | overflow | overflow | other | sparse operations | sparse to dense operation | increase variable precision/change variable type | change variable type | use int64 as the index | Index output_idx = indices[thread_idx * ndims + ndims - 1]; | int64 output_idx = indices[thread_idx * ndims + ndims - 1]; ||||||
156 | Tensorflow/Keras | dc4d330cfe25bbb0c3e4759dadfb16d4715f338a | C++ | fix | underflow | underflow | If a complex value's squared norm was denormal but had a non-zero imaginary part, the Householder reflection computation could yield NaNs. | linear algebra | norm | norm | use a different algorithm | use a different algorithm | By using a more accurate norm, we can avoid the underflow. The new norm computes sqrt(x^2 + y^2 + ...), avoiding overflow/underflow | auto mu = Sqrt(Real(alpha * Conj(alpha)) + sigma); | XlaOp Norm(std::vector<XlaOp> xs) { + CHECK(!xs.empty()); + XlaOp w; + for (size_t i = 0; i < xs.size(); ++i) { + xs[i] = Abs(xs[i]); + w = i == 0 ? xs[i] : xla::Max(w, xs[i]); + } + + XlaOp out; + for (size_t i = 0; i < xs.size(); ++i) { + XlaOp t = Square(xs[i] / w); + out = i == 0 ? t : xla::Add(out, t); + } + return Select(Eq(w, ZerosLike(w)), ZerosLike(w), w * Sqrt(out)); auto mu = Norm({Real(alpha), Imag(alpha), Sqrt(sigma)});
|||||
157 | Tensorflow/Keras | ff6601a943db5f71fda09210e67ba8e9fd839ae8 | C++ | fix | overflow | overflow | CombinedNonMaxSuppression greedily selects a subset of bounding boxes in descending order of score | data processing | image processing | non_max_suppression | increase variable precision/change variable type | change variable type | Use an int32 scalar as the default type for representing the maximum number of boxes retained over all classes. If int32 should overflow, use int64 | max_total_size = ops.convert_to_tensor(max_total_size) |
class CombinedNonMaxSuppressionTest(test_util.TensorFlowTestCase): + + # NOTE(b/142795960): parameterized tests do not work well with tf.tensor + # inputs. Due to failures, creating another test `testInvalidTensorInput` + # which is identical to this one except that the input here is a scalar as + # opposed to a tensor. + def testInvalidPyInput(self): + boxes_np = [[[[0, 0, 1, 1], [0, 0.1, 1, 1.1], [0, -0.1, 1, 0.9], + [0, 10, 1, 11], [0, 10.1, 1, 11.1], [0, 100, 1, 101]]]] + scores_np = [[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]] + max_output_size_per_class = 5 + max_total_size = 2**31 + with self.assertRaisesRegex( + (TypeError, ValueError), + "type int64 that does not match expected type of int32|" + "Tensor conversion requested dtype int32 for Tensor with dtype int64"): + image_ops.combined_non_max_suppression( + boxes=boxes_np, + scores=scores_np, + max_output_size_per_class=max_output_size_per_class, + max_total_size=max_total_size) + + # NOTE(b/142795960): parameterized tests do not work well with tf.tensor + # inputs. Due to failures, creating another this test which is identical to + # `testInvalidPyInput` except that the input is a tensor here as opposed + # to a scalar. + def testInvalidTensorInput(self): + boxes_np = [[[[0, 0, 1, 1], [0, 0.1, 1, 1.1], [0, -0.1, 1, 0.9], + [0, 10, 1, 11], [0, 10.1, 1, 11.1], [0, 100, 1, 101]]]] + scores_np = [[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]] + max_output_size_per_class = 5 + max_total_size = ops.convert_to_tensor(2**31) + with self.assertRaisesRegex( + (TypeError, ValueError), + "type int64 that does not match expected type of int32|" + "Tensor conversion requested dtype int32 for Tensor with dtype int64"): + image_ops.combined_non_max_suppression( + boxes=boxes_np, + scores=scores_np, + max_output_size_per_class=max_output_size_per_class, + max_total_size=max_total_size) |
|||||
158 | Tensorflow/Keras | 94b6db8cc538408cc29d88be13307f9fd8a77120 | C++ | fix | overflow | overflow | Dynamic stitch interleaves the values from the data tensors into a single tensor. | slice_size must not be stored as int for cases of tensors over 2GB. | data processing | data | dynamic_stitch | increase variable precision/change variable type | change variable type | use auto type instead of int | const int slice_size = merged_flat.dimension(1); | const auto slice_size = merged_flat.dimension(1); | ||||
159 | Tensorflow/Keras | 087859fce9409991164f727735743da4cb310fd4 | C++ | fix | overflow | overflow | large input size | other | computational graph | bilinear operation, computational graph optimization | increase variable precision/change variable type | change variable type | use int64 instead of int | const int output_elements = CalculateTensorElementCount( | const int64 output_elements = CalculateTensorElementCount | // Cost with very large tensor. + op_context.op_info.clear_outputs(); + // Number of elements in tensor exceeds 2^32. + constexpr int64 kLargeOutputImageDim = 40000; + DescribeTensor4D(1, kLargeOutputImageDim, kLargeOutputImageDim, + kChannelSize, op_context.op_info.add_outputs()); + const int64 kInterpWeightCost = 12; + // Using half_pixel_centers. + AttrValue half_pixel_centers; + half_pixel_centers.set_b(true); + (*op_context.op_info.mutable_attr())["half_pixel_centers"] = + half_pixel_centers; + + const int64 num_ops = + kInterpWeightCost * (kLargeOutputImageDim * 2) + + kComputeLerpCost * + (kLargeOutputImageDim * kLargeOutputImageDim * kChannelSize); + const int64 expected_compute_time = std::ceil( + num_ops / + estimator_.GetDeviceInfo(op_context.op_info.device()).gigaops); + + const int64 expected_memory_time = + (kImageDim * kImageDim + kLargeOutputImageDim * kLargeOutputImageDim) * + 4; + + const auto cost = PredictCosts(op_context); + EXPECT_EQ(cost.compute_time, Costs::Duration(expected_compute_time)); + EXPECT_EQ(cost.memory_time, Costs::Duration(expected_memory_time)); + EXPECT_EQ(cost.execution_time, + Costs::Duration(expected_memory_time + expected_compute_time)); + EXPECT_FALSE(cost.inaccurate); + EXPECT_EQ(cost.num_ops_with_unknown_shapes, 0); + } |
||||
160 | Tensorflow/Keras | 90e89339a9bf04fb304129a01ca50f25fdde441d | C++ | fix | overflow | overflow | potential overflow in 64-bit MultiplyByQuantizedMultiplier function |
quantization | quantization | quantization | use a different algorithm | use a different algorithm | int32_t reduced_multiplier = (quantized_multiplier + (1 << 15)) >> 16; |
int32_t reduced_multiplier = (quantized_multiplier < 0x7FFF0000) + ? ((quantized_multiplier + (1 << 15)) >> 16) + : 0x7FFF; |
||||||
161 | Tensorflow/Keras | dffb0b56192f4c95fbf563a82742b4a3f4881e05 | C++ | fix | overflow | overflow | A U16 of 46977 multiplied by a U16 of 53826, when evaluated in the evaluator, results in the operands of the multiply getting promoted to the C++ type "int" which is signed. The result of the multiply will overflow a signed int and give a negative result. |
compiler | compiler | compiler, XLA HLO (high level operations) | increase variable precision/change variable type | change variable type | promote both operands to "unsigned int" which will not suffer from any overflow issues |
typename std::enable_if<std::is_integral<T>::value && - std::is_signed<T>::value>::type* = nullptr> -typename std::make_unsigned<T>::type ToArithmeticSafeType(T t) { - return static_cast<typename std::make_unsigned<T>::type>(t); - typename std::enable_if<!std::is_integral<T>::value || - !std::is_signed<T>::value>::type* = nullptr> |
namespace detail { +template <typename T> +using unsigned_promoted_type_t = + std::make_unsigned_t<decltype(std::declval<T>() + std::declval<T>())>; +} + typename std::enable_if<std::is_integral<T>::value>::type* = nullptr> +detail::unsigned_promoted_type_t<T> ToArithmeticSafeType(T t) { + return static_cast<detail::unsigned_promoted_type_t<T>>(t); + typename std::enable_if<!std::is_integral<T>::value>::type* = nullptr> |
|||||
162 | Tensorflow/Keras | 90b80fba1ade0222713b8a33af00858190532075 | C++ | fix | overflow | overflow | compiler message overflow | compiler | compiler | compiler, XLA HLO (high level operations) | limit input range | limit input range | limit the maximum number of inputs summarized | SummarizeNodeDef(node_def), ".\n"); string SummarizeNodeDef(const NodeDef& node_def) {
SummarizeNodeDef(node_def, /*max_inputs_in_summary=*/10), ".\n"); +string SummarizeNodeDef(const NodeDef& node_def, int max_inputs_in_summary) { + if (max_inputs_in_summary-- == 0) { + strings::StrAppend(&ret, "..."); + break; +// The parameter `max_inputs_in_summary` specifies how many inputs at most to +// serialize in the output (in order not to get a string which is overly large). +// The value `-1` specifies that all inputs will be shown. +string SummarizeNodeDef(const NodeDef& node_def, + int max_inputs_in_summary = -1); |
|||||
163 | Tensorflow/Keras | 036b75a818493a30cd25caef1761931a3bc2b074 | C++ | fix | overflow | overflow | compiler | compiler | compiler | increase variable precision/change variable type | increase variable precision | increase precision of index from int to int64 | int linear_index = j * vector_size + i; | int64 linear_index = j * vector_size + i; | ||||||
164 | Tensorflow/Keras | 2adf1114d4dc7ca30e5117acd2dc7aeb3279feb7 | C++ | unit test | overflow | overflow | The Android Neural Networks API (NNAPI) is available on all Android devices running Android 8.1 (API level 27) or higher. It provides acceleration for TensorFlow Lite models on Android devices with supported hardware accelerators including: Graphics Processing Unit (GPU) Digital Signal Processor (DSP) Neural Processing Unit (NPU) |
other | other | NNAPI delegate | add overflow check | add overflow check | apply the filter-window overflow check only for the quantized reference CPU path | // reference CPU path. - Expect(is_accelerator_specified || - (builtin->filter_width * builtin->filter_height <= 256), - NNAPIValidationFailureType::kUnsupportedOperandSize, - "Large filter window would overflow on the reference CPU path", - &val_ctx); |
// quantized reference CPU path. + if (IsQuantized(context->tensors[node->inputs->data[0]].type)) { + Expect(is_accelerator_specified || + (builtin->filter_width * builtin->filter_height <= 256), + NNAPIValidationFailureType::kUnsupportedOperandSize, + "Large filter window would overflow on the reference CPU path", + &val_ctx); + } |
|||||
165 | Tensorflow/Keras | 85f10eb4200b3b3339340943b288da157e9742e7 | C++ | unit test | overflow | overflow | Compilers are producing different code and resulting in bad assumptions. |
precision tests/speed benchmarks | overflow test | overflow test | increase variable precision/change variable type | change variable type | change the types of the loop variables x and y from auto to int64 | - for (auto x : interesting) { - for (auto y : interesting) { - long double dxy = static_cast<long double>(x) * y; - if (dxy > std::numeric_limits<int64>::max()) { - EXPECT_LT(xy, 0); |
bool HasOverflow(int64 x, int64 y) { +#ifdef PLATFORM_WINDOWS + // `long double` on MSVC is 64 bits not 80 bits - use a windows specific API + // for this test. + return ::MultiplyHigh(x, y) != 0; +#else + long double dxy = static_cast<long double>(x) * static_cast<long double>(y); + return dxy > std::numeric_limits<int64>::max(); +#endif +} + for (int64 x : interesting) { + for (int64 y : interesting) { if (HasOverflow(x, y)) { + EXPECT_LT(xy, 0) << x << " " << y; |
|||||
166 | Tensorflow/Keras | 171ba06f5e52078e0aa2112797b5a4227370bbd5 | C++ | unit test | overflow | overflow | Subgraphs are parts of the main graph and are themselves computational graphs. | other | computational graph | tensorflow subgraph graph generation | add overflow check | add overflow check | bring back overflow detection for Windows |
167 | Tensorflow/Keras | 2522ce7dd5d28c9733824a66133fc918290e3ed0 | C++ | fix | overflow | overflow | data processing | tensor allocation | tensor allocation | add overflow check | add overflow check | Check for overflow in # of bytes computation of tensor allocation. Check both for product of shape dimensions (# of elements) and number of bytes (elements * sizeof(data_type)). |
no overflow check | TfLiteStatus MultiplyAndCheckOverflow(size_t a, size_t b, size_t* product) { + constexpr size_t overflow_threshold = (8 * sizeof(size_t)) >> 1; + *product = a * b; + // If neither integers have non-zero bits past 32 bits can't overflow. + // Otherwise check using slow devision. + if (__builtin_expect((a | b) >> overflow_threshold != 0, false)) { + if (a != 0 && *product / a != b) return kTfLiteError; + } + return kTfLiteOk; + for (int k = 0; k < dims_size; k++) { + size_t old_count = count; + TF_LITE_ENSURE_MSG( + &context_, + MultiplyAndCheckOverflow(old_count, dims[k], &count) == kTfLiteOk, + "BytesRequired number of elements overflowed.\n"); + } size_t type_size = 0; TF_LITE_ENSURE_OK(&context_, GetSizeOfType(&context_, type, &type_size)); + TF_LITE_ENSURE_MSG( + &context_, MultiplyAndCheckOverflow(type_size, count, bytes) == kTfLiteOk, + "BytesRequired number of bytes overflowed.\n"); |
TEST(BasicInterpreter, TestOverflow) { + TestErrorReporter reporter; + Interpreter interpreter(&reporter); + TfLiteQuantizationParams quantized; + + ASSERT_EQ(interpreter.AddTensors(1), kTfLiteOk); + ASSERT_EQ(interpreter.SetInputs({0}), kTfLiteOk); + ASSERT_EQ(interpreter.SetOutputs({0}), kTfLiteOk); + // Overflow testing is pointer word size dependent. + if (sizeof(size_t) == 8) { + // #bits for bytecount = 30 + 30 + 2 = 62 < 64 + ASSERT_EQ(interpreter.SetTensorParametersReadWrite( + 0, kTfLiteFloat32, "in1", {1 << 30, 1 << 30}, quantized), + kTfLiteOk); + // #bits for element count = 30 + 30 + 2 = 62 < 64 (no overflow) + // #bits for byte count = 30 + 30 + 2 + 2 = 64 == 64 (overflow) + ASSERT_NE( + interpreter.SetTensorParametersReadWrite( + 0, kTfLiteFloat32, "in1", {1 << 30, 1 << 30, 1 << 2}, quantized), + kTfLiteOk); + EXPECT_THAT( + reporter.error_messages(), + testing::EndsWith("BytesRequired number of bytes overflowed.\n")); + // #bits for element count = 30 + 30 + 2 + 4 = 66 > 64 (overflow). + // #bits for byte count = 30 + 30 + 2 + 4 + 2 = 68 > 64 (overflow). + reporter.Reset(); + ASSERT_NE(interpreter.SetTensorParametersReadWrite( + 0, kTfLiteFloat32, "in1", {1 << 30, 1 << 30, 1 << 2, 1 << 4}, + quantized), + kTfLiteOk); + EXPECT_THAT( + reporter.error_messages(), + testing::EndsWith("BytesRequired number of elements overflowed.\n")); + + } else if (sizeof(size_t) == 4) { + // #bits for bytecount = 14 + 14 + 2 = 30 < 32 + ASSERT_EQ(interpreter.SetTensorParametersReadWrite( + 0, kTfLiteFloat32, "in1", {1 << 14, 1 << 14}, quantized), + kTfLiteOk); + // #bits for element count = 14 + 14 + 3 = 31 < 32 (no overflow). + // #bits for byte count = 14 + 14 + 3 + 2 = 33 > 32 (overflow). + ASSERT_NE( + interpreter.SetTensorParametersReadWrite( + 0, kTfLiteFloat32, "in1", {1 << 14, 1 << 14, 1 << 3}, quantized), + kTfLiteOk); + EXPECT_THAT( + reporter.error_messages(), + testing::EndsWith("BytesRequired number of bytes overflowed.\n")); + // #bits for element count = 14 + 14 + 4 = 32 == 32 (overflow). + // byte count also overflows, but we don't get to that check. + reporter.Reset(); + ASSERT_NE( + interpreter.SetTensorParametersReadWrite( + 0, kTfLiteFloat32, "in1", {1 << 14, 1 << 14, 1 << 4}, quantized), + kTfLiteOk); + EXPECT_THAT( + reporter.error_messages(), + testing::EndsWith("BytesRequired number of elements overflowed.\n")); + } else { + // This test failing means that we are using a non 32/64 bit architecture. + ASSERT_TRUE(false); + } +} |
|||||
168 | Tensorflow/Keras | 75e5b5d70b6f33bd41fdf07b844c762b23f99d1b | C++ | fix | overflow | overflow | overflows in accumulation results | tensor math | tensor math | summation | increase variable precision/change variable type | increase variable precision | upcast to an integer type with more bits | N/A | // Upcast small integer types to 32 bit to avoid overflow. + if (dtype == DT_INT8 || dtype == DT_INT16) { + return DT_INT32; + } + if (dtype == DT_UINT8 || dtype == DT_UINT16) { + return DT_UINT32; + } |
N/A | ||||
169 | Tensorflow/Keras | 23fde233bf3210759b5a4453bc39101df9c86d0c | C++ | fix | overflow | overflow | tensor math | statistics | mean | increase variable precision/change variable type | increase variable precision | Perform mean reductions for integer types in 64 bit to mitigate overflow in the sum and/or denominator. I.e.: Upcast int8, int16, int32 into int64 |
define CASTING_SPECIALIZATION(ScalarType, IntermediateType) \ + template <typename Device, typename OUT_T, typename IN_T, \ + typename ReductionAxes> \ + struct ReduceEigenImpl<Device, OUT_T, IN_T, ReductionAxes, \ + functor::MeanReducer<ScalarType>> { \ + void operator()(const Device& d, OUT_T out, IN_T in, \ + const ReductionAxes& reduction_axes, \ + const functor::MeanReducer<ScalarType>& reducer) { \ + static_assert(std::is_same<ScalarType, typename OUT_T::Scalar>::value, \ + ""); \ + Eigen::internal::SumReducer<IntermediateType> sum_reducer; \ + out.device(d) = (in.template cast<IntermediateType>().reduce( \ + reduction_axes, sum_reducer) / \ + static_cast<IntermediateType>(in.size() / out.size())) \ + .template cast<ScalarType>(); \ + } \ + } |
# This tests the issue reported in b/145030710. + @test_util.run_deprecated_v1 + def testSizeOverflowUint8(self): + np_arr = self._makeRandom((2**8,), dtypes.uint8) + self._compareAllAxes(np_arr) + + @test_util.run_deprecated_v1 + def testSizeOverflowInt8(self): + np_arr = self._makeRandom((2**7,), dtypes.int8) + self._compareAllAxes(np_arr) + + @test_util.run_deprecated_v1 + def testSizeOverflowUint16(self): + np_arr = self._makeRandom((2**16,), dtypes.uint16) + self._compareAllAxes(np_arr) + + @test_util.run_deprecated_v1 + def testSizeOverflowInt16(self): + np_arr = self._makeRandom((2**15,), dtypes.int16) + self._compareAllAxes(np_arr) |
||||||
170 | Tensorflow/Keras | 79605069321520bd8af318eef92b71070dcc8961 | C++ | fix | overflow | overflow | strided_slice extracts a strided slice of a tensor (generalized Python array indexing). | strided_slice would overflow for end and start slices larger than int16 | other | other | strided slice kernel | increase variable precision/change variable type | increase variable precision | change the StridedSliceParams start_indices and end_indices from int16 to int32 values | struct StridedSliceParams { int8 start_indices_count; - int16 start_indices[4]; int8 stop_indices_count; - int16 stop_indices[4]; int8 strides_count; - int16 strides[4]; |
struct StridedSliceParams { int8 start_indices_count; + int32 start_indices[4]; int8 stop_indices_count; + int32 stop_indices[4]; int8 strides_count; + int32 strides[4]; |
TEST(StridedSliceOpTest, In1D_Int32End) { + StridedSliceOpModel<> m({32768}, {1}, {1}, {1}, 0, 0, 0, 0, 0); + std::vector<float> values; + for (int i = 0; i < 32768; i++) { + values.push_back(i); + } + m.SetInput(values); + m.SetBegin({0}); + m.SetEnd({32768}); + m.SetStrides({1}); + m.Invoke(); + EXPECT_THAT(m.GetOutputShape(), ElementsAreArray({32768})); + EXPECT_THAT(m.GetOutput(), ElementsAreArray(values)); |
|||
171 | Tensorflow/Keras | eaea3db3be4e27464a0b669bebffe46f2f8b005f | C++ | fix | overflow | overflow | overflow in quantization if there is a mismatch in scale of weights and biases | quantization | quantization | quantization | limit input range | limit input range | Adjusts the scale of the weight tensor when the scale is small enough to lead to overflow due to a mismatch with the scale of the bias values. Checks that the bias is quantized to within the middle half of the allowable bit range determined by the scales of the input and weight tensors. If this condition is not satisfied, the scale of the weights is increased in order to prevent overflow. |
TfLiteStatus AdjustWeightsForBiasScale(QuantizationParametersT* quant_params, + const float* bias_data, + const size_t bias_size, + const float input_scale, + ErrorReporter* error_reporter) { + // TODO(dmolitor) Allow adjusting activation scale. + // TODO(dmolitor) Tighten scale adjustment. + // TODO(dmolitor) Test using a separate strategy for scales of 0. + const int32_t kScale = std::numeric_limits<int32_t>::max(); + if (quant_params == nullptr) { + error_reporter->Report("Missing max and min values for weight tensor."); + return kTfLiteError; + } + // channel_dim_size is calculated from min.size() to infer whether + // quantization is per axis + int channel_dim_size = quant_params->min.size(); + if (channel_dim_size == 0) { + error_reporter->Report( + "Missing weight scales. Unable to check compatibility with bias " + "scale."); + return kTfLiteError; + } + + std::vector<float> weight_scales(channel_dim_size); + TF_LITE_ENSURE_STATUS(GetSymmetricScalesFromMaxMin( + quant_params, &weight_scales, error_reporter)); + + // Per channel quantization + if (channel_dim_size > 1) { + for (size_t i = 0; i < channel_dim_size; ++i) { + // Current scale is not compatible with bias. Adjust max/min values. + if (std::abs(bias_data[i]) >= + 0.5 * input_scale * weight_scales[i] * kScale) { + quant_params->max[i] = 2.0 * std::abs(bias_data[i]) / kScale * + (kMaxQuantizedValue / input_scale); + quant_params->min[i] = -quant_params->max[i]; + } + } + // Per layer quantization + } else if (channel_dim_size == 1) { + const auto minmax = std::minmax_element(bias_data, bias_data + bias_size); + const float bias_half_range = + std::max(std::abs(*minmax.first), std::abs(*minmax.second)); + + // Need to adjust weight min/max; not compatible with bias. + if (bias_half_range / kScale >= 0.5 * input_scale * weight_scales[0]) { + quant_params->min[0] = + 2.0 * bias_half_range / kScale * (kMinQuantizedValue / input_scale); + quant_params->max[0] = + 2.0 * bias_half_range / kScale * (kMaxQuantizedValue / input_scale); + } + } + return kTfLiteOk; |
||||||
172 | Tensorflow/Keras | 676bce388aba376a4e6f7307dc92fdc0a8b3af42 | C++ | fix | overflow | overflow | Quantized mean and sum have a risk of overflow | quantization | quantization | quantization | limit input range | limit input range | cast input based on numeric limits | // Convert to float value. - output_data[idx] = - static_cast<T>(std::round(float_mean * scale + bias)) + - output_zero_point; |
float result = + std::min(std::round(float_mean * scale + bias) + output_zero_point, + static_cast<float>(std::numeric_limits<T>::max())); + result = + std::max(result, static_cast<float>(std::numeric_limits<T>::min())); + output_data[idx] = static_cast<T>(result); |
N/A | ||||
173 | Tensorflow/Keras | e08474a981b87a8c4fdc9d9d08765727fe8d629e | C++ | fix | overflow | overflow | compiler | compiler | compiler, variable accessor | increase variable precision/change variable type | increase variable precision | Change shared variables to high precision | ||||||||
174 | Tensorflow/Keras | c782a538b0b90d93c6070ac177cb1f542272bcce | C++ | fix | overflow | overflow | overflowing of integer "+" and "-" operations | CNN operations | convolution | convolution transpose | rewrite math formula | rewrite math formula | - int i = y * $kernel_size.x$ + x; - ivec2 idx = gid.xy + ivec2(x, y) - $padding$; |
int i = int(float(y * $kernel_size.x$) + float(x)); + ivec2 idx = ivec2(vec2(gid.xy + ivec2(x, y)) - vec2($padding$)); |
||||||
175 | Tensorflow/Keras | ea316ec1827bacae811858a7f681dfac47ef7f47 | C++ | fix | overflow | overflow | signed overflow | compiler | compiler | compiler, dot interpreter | increase variable precision/change variable type | change variable type | change type to unsigned | - static_cast<ElementwiseT>(lhs_literal.Get<ReturnT>(lhs_index)) * - static_cast<ElementwiseT>(rhs_literal.Get<ReturnT>(rhs_index)); |
ElementwiseT lhs_val(lhs_literal.Get<ReturnT>(lhs_index)); + ElementwiseT rhs_val(rhs_literal.Get<ReturnT>(rhs_index)); ToArithmeticSafeType(lhs_val) * ToArithmeticSafeType(rhs_val); |
|||||
176 | Tensorflow/Keras | 09b8ed34f47dbd6921304f2d4ceb3669c1e089e6 | Python | fix | overflow | overflow | int32 overflow | other | other | flatten layer | increase variable precision/change variable type | increase variable precision | increase precision of the shape variable to int64 if necessary; otherwise keep it as int32 | input_shape = inputs.shape + if input_shape[1:].is_fully_defined(): + flattened_dim = tensor_shape.dimension_value( + np.prod(input_shape[1:], dtype=int)) + # Temporary fix for integer overflow issue. + if flattened_dim > np.iinfo(np.int32).max: + shape_dtype = dtypes.int64 + else: + shape_dtype = dtypes.int32 + outputs = array_ops.reshape( + inputs, constant_op.constant((-1, flattened_dim), shape_dtype)) |
def testFlattenLargeDim(self): + x = array_ops.placeholder(shape=(None, 21316, 21316, 80), dtype='float32') + y = core_layers.Flatten()(x) + self.assertEqual(y.shape.as_list(), [None, 21316 * 21316 * 80]) |
|||||
177 | Tensorflow/Keras | dbcb2a5470e40974924cebd0f74d7f117b21bf8e | C++ | fix | overflow | overflow | compiler | compiler | compiler, bit cast operation | increase variable precision/change variable type | increase variable precision | increase precision of an integer to int64 | auto output_bit_width_mask = (1 << output_bit_width) - 1; | auto output_bit_width_mask = (int64(1) << output_bit_width) - 1; |
||||||
178 | Tensorflow/Keras | 0d6095963d907e0de1d635842d8ed80759a436ba | C++ | fix | overflow | overflow | data processing | memory allocator | ruy allocator, size | increase variable precision/change variable type | change variable type | change from std::size_t to std::ptrdiff_t | ||||||||
179 | Tensorflow/Keras | 5b4fe5470852d1aea737b194e03727cdedddebca | C++ | fix | underflow | underflow | exponent smaller than -31 causes underflow | quantization | quantization | quantization | rewrite math formula | rewrite math formula | For exponents smaller than -31, set shift to zero | void GuardedQuantizeMultiplier(double effective_output_scale, - int32_t* significand, int* shift) { - QuantizeMultiplier(effective_output_scale, significand, shift); - // Additional guard to make sure RoundingDivideByPOT does not fail. - if (*shift < -31) { - // If shift is less than -31, RoundingDivideByPOT fails. This happens when - // min and max are close and small. For this particular case, both - // significand and shift are set to zero. - *significand = 0; - *shift = 0; - } -} |
void QuantizeMultiplier(double double_multiplier, int32_t* quantized_multiplier, ++*shift; } if (*shift < -31) { + *shift = 0; + q_fixed = 0; + } QuantizeMultiplier(effective_output_scale, &significand, &shift); |
TEST(QuantizationUtilTest, QuantizeMultiplierUnderflow) { + auto quantize = [](double d) { + int32_t q; + int s; + QuantizeMultiplier(d, &q, &s); + return std::pair<int32_t, int>{q, s}; + }; + + EXPECT_THAT(quantize(std::ldexp(1.0f, -31)), Pair(1073741824, -30)); + EXPECT_THAT(quantize(std::ldexp(1.0f, -32)), Pair(1073741824, -31)); + EXPECT_THAT(quantize(std::ldexp(0.99f, -32)), Pair(0, 0)); + EXPECT_THAT(quantize(std::ldexp(1.0f, -33)), Pair(0, 0)); +} |
||||
180 | Tensorflow/Keras | 3af3959377d54414f480d617402274f7e9440316 | C++ | fix | overflow | overflow | Using sqrt(a^2 + b^2) | tensor math | tensor math | absolute value of a complex number | rewrite math formula | rewrite math formula | use sqrt(a^2 + b^2) = sqrt(a^2 * (1 + b^2/a^2)) = |a| * sqrt(1 + (b/a)^2), with the assumption that |a| >= |b| (see the sketch after this row) |
case HloOpcode::kAbs: { - auto sum_sq = FAdd( - FMul(EmitExtractReal(operand_value), EmitExtractReal(operand_value)), - FMul(EmitExtractImag(operand_value), EmitExtractImag(operand_value))); - return llvm_ir::EmitCallToIntrinsic(llvm::Intrinsic::sqrt, {sum_sq}, |
StatusOr<llvm::Value*> ElementalIrEmitter::EmitComplexAbs( + PrimitiveType prim_type, llvm::Value* operand_value) { + auto real = EmitExtractReal(operand_value); + auto imag = EmitExtractImag(operand_value); + auto abs_real = llvm_ir::EmitCallToIntrinsic(llvm::Intrinsic::fabs, {real}, + {real->getType()}, b_); + auto abs_imag = llvm_ir::EmitCallToIntrinsic(llvm::Intrinsic::fabs, {imag}, + {imag->getType()}, b_); + auto max = EmitFloatMax(abs_real, abs_imag); + auto min = EmitFloatMin(abs_real, abs_imag); + + auto div = FDiv(min, max); + auto div_sq = FMul(div, div); + auto one = llvm::ConstantFP::get(max->getType(), 1); + TF_ASSIGN_OR_RETURN(auto sqrt, EmitSqrt(prim_type, FAdd(one, div_sq))); + + auto zero = llvm::ConstantFP::get(max->getType(), 0); + return Select(FCmpOEQ(max, zero), zero, FMul(max, sqrt)); |
|||||
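A minimal standalone sketch of the rescaled-magnitude formula described in entry 180 above; this is illustrative scalar code, not the XLA emitter change from the commit, and the name `StableComplexAbs` is hypothetical:

```cpp
#include <algorithm>
#include <cmath>

// Factor out the larger of |re| and |im| so the squared ratio stays <= 1 and
// cannot overflow; the max == 0 branch avoids 0/0 when both parts are zero.
double StableComplexAbs(double re, double im) {
  const double a = std::fabs(re);
  const double b = std::fabs(im);
  const double mx = std::max(a, b);
  const double mn = std::min(a, b);
  if (mx == 0.0) return 0.0;
  const double r = mn / mx;
  return mx * std::sqrt(1.0 + r * r);
}
```

For example, `StableComplexAbs(3.0, 4.0)` evaluates to 4 * sqrt(1 + 0.5625) = 5, matching sqrt(3^2 + 4^2), while inputs near the floating-point maximum no longer overflow in the intermediate squares.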
181 | Tensorflow/Keras | 840f25bd4623e5a9aedcbe6163332f51ee303784 | C++ | fix | overflow | overflow | signed integer overflow in HandleCopies when batch_size * indices_size * slice_size is larger than int32 | data processing | parallelism | kernels, gather | increase variable precision/change variable type | increase variable precision | use int64 instead of int32 for large values | bool use_large = + batch_size * indices_size * slice_size > + std::numeric_limits<int32>::max()); |
||||||
182 | Tensorflow/Keras | 8211365f9e8aed8cec7b63d7eb992ab104422f8c | C++, Python | fix | overflow | overflow | build error on Windows caused by potential int32 overflow | data processing | data | shard size | increase variable precision/change variable type | increase variable precision | increase precision from int32 to int64 to calculate the default shard size | self._shard_size_bytes = ( - shard_size_bytes - if shard_size_bytes is not None else 10 * 1024 * 1024 * 1024) self._pending_snapshot_expiry_seconds = ( pending_snapshot_expiry_seconds - if pending_snapshot_expiry_seconds is not None else 86400) |
// Defaults to 10 GiB per shard. +const int64 kDefaultShardSizeBytes = 10L * 1024 * 1024 * 1024; if (shard_size_bytes_ == -1) shard_size_bytes_ = kDefaultShardSizeBytes; + + // Default to 1 day expiry for snapshots. + if (pending_snapshot_expiry_seconds_ == -1) { + pending_snapshot_expiry_seconds_ = 86400; + } shard_size_bytes if shard_size_bytes is not None else -1) |
|||||
183 | Tensorflow/Keras | 8ac1075eac1ab9072e29c025348f749b43f251cf | C++ | fix | overflow | overflow | optimizers | optimizers | experimental optimizer | limit input range | limit input range | add upper bound scaling to prevent overflow | element = T(UniformDistribution(RandomType(0), RandomType(1), &gen)); | auto upper_bound = + RandomType(std::is_same<T, Eigen::half>::value ? 0.1 : 1.0); + element = T(UniformDistribution(RandomType(0), upper_bound, &gen)); |
||||||
184 | Tensorflow/Keras | 737600454df83be02fac46e48b093a8892c7241a | C++ | unit test | overflow | overflow | linear algebra | linear algebra | matrix multiply | limit input range | limit input range | Avoid the corner case where both lhs and rhs zero_point's are the lowest representable value in their respective quantized type. E.g. when both LHS and RHS are uint8 with zero_point=0. | if (!use_golden && !std::is_floating_point<LhsScalar>::value) { - lhs_params.zero_point = random_engine() % 8; if (!use_golden && !std::is_floating_point<RhsScalar>::value) { - rhs_params.zero_point = random_engine() % 8; |
if (!std::is_floating_point<LhsScalar>::value) { + lhs_params.zero_point = 1; + if (!use_golden) { + lhs_params.zero_point += random_engine() % 8; + } if (!std::is_floating_point<RhsScalar>::value) { + rhs_params.zero_point = 1; + if (!use_golden) { + rhs_params.zero_point += random_engine() % 8; + } |
||||||
185 | Tensorflow/Keras | c38b41d7c813e0dc26fa99cf6495ec474a595542 | C++ | fix | overflow | overflow | possible float-to-integer-cast overflow | precision tests/speed benchmarks | timing | timing | increase variable precision/change variable type | change variable type | change the variable holding the processing time from int64 to double | const int64 processing_time = TotalProcessingTime(snapshot); const int64 output_time = OutputTime(snapshot); int64 best_delta = -1; int64 new_output_time = OutputTime(snapshot); int64 delta = output_time - new_output_time; int64 Model::OutputTime(std::shared_ptr<Node> node)
const double processing_time = TotalProcessingTime(snapshot); const double output_time = OutputTime(snapshot); double best_delta = -1.0L; double new_output_time = OutputTime(snapshot); double delta = output_time - new_output_time; double Model::OutputTime(std::shared_ptr<Node> node) { |
|||||
186 | Tensorflow/Keras | 52a6cfddef9b51b608b4a554b77a10e1522d56ec | C++ | fix | overflow | overflow | overflow of variable size | data processing | parallelism | segmented reduction—a parallel reduction over many irregular-length segments. | increase variable precision/change variable type | increase variable precision | change int to int64 for number of threads, block size, block numbers, compute cycles, compute bytes | const int num_threads = cpu_device.numThreads(); const int min_block_size = 64; - const int max_block_num = std::min(N / min_block_size + 1, num_reductions); - int block_num = std::min(max_block_num, num_threads); - const int block_size = N / block_num; const int compute_cycles = 5 * (N - num_reductions) * inner_dim; - const int output_bytes = num_reductions * inner_dim * sizeof(T); |
const int64 num_threads = cpu_device.numThreads(); const int64 min_block_size = 64; + int64 block_num = std::min(num_reductions, num_threads); + int64 block_size = (N - 1) / block_num + 1; + if (block_size < min_block_size) { + block_size = min_block_size; + block_num = (N - 1) / min_block_size + 1; const int64 compute_cycles = 5 * (N - num_reductions) * inner_dim; + const int64 output_bytes = num_reductions * inner_dim * sizeof(T); |
|||||
187 | Tensorflow/Keras | aa4765a1417950cf2c29afd0172aebdd31b0725f | C++ | fix | overflow | overflow | cast overflow undefined behavior | tensor math | tensor math | absolute value of a complex number | increase variable precision/change variable type | increase variable precision | Change the function that outputs the absolute value to return a double instead of a float to avoid cast overflow for inputs with types double and complex128. |
float FpAbsoluteValue(NativeT value) | double FpAbsoluteValue(NativeT value) | TEST(LiteralTestUtilTest, ExpectNearDoubleOutsideFloatValueRange) { + auto two_times_float_max = + LiteralUtil::CreateR0<double>(2.0 * std::numeric_limits<float>::max()); + ErrorSpec error(0.001); + EXPECT_TRUE( + LiteralTestUtil::Near(two_times_float_max, two_times_float_max, error)); |
||||
188 | Tensorflow/Keras | d0136d4affebd14fee59ba1865d5f1c8fa64251a | C++ | fix | overflow | overflow | TensorFlow BFC Allocator is a memory allocator that implements a 'best-fit with coalescing' algorithm. | index integer overflow | data processing | memory allocator | TensorFlow BFC Allocator | increase variable precision/change variable type | change variable type | change type of an index from int to size_t | int IndexFor(const void* p) const return static_cast<int>(((p_int - base_int) >> kMinAllocationBits)); |
size_t IndexFor(const void* p) const return static_cast<size_t>(((p_int - base_int) >> kMinAllocationBits)); |
||||
189 | Tensorflow/Keras | f9ac078ebd0d05b64691e6718d404ee801f80c67 | C++ | fix | overflow | overflow | conversion to float32 results in overflow | other | other | number casting | add overflow check | add overflow check | return error if overflow and return null pointer if infinity | double as_double = PyFloat_AsDouble(v); + // Handle infinity. + if (as_double == std::numeric_limits<double>::infinity()) { + *out = std::numeric_limits<T>::infinity(); + return nullptr; + } else if (as_double == -1 * std::numeric_limits<double>::infinity()) { + *out = -1 * std::numeric_limits<T>::infinity(); + return nullptr; + } + // Check for overflow. + if (as_double > std::numeric_limits<T>::max() || + as_double < std::numeric_limits<T>::lowest()) { + return ErrorOutOfRangeDouble; + } + *out = static_cast<T>(as_double); |
||||||
190 | Tensorflow/Keras | c8e8f35f3e39b36d105eb7e43321a9da1362f242 | C++ | fix | overflow | overflow | integer overflow | precision tests/speed benchmarks | timing | timing | increase variable precision/change variable type | increase variable precision | use unsigned long long int | event->set_timestamp_ps(node.all_start_micros() * 1000000); - event->set_duration_ps(node.all_end_rel_micros() * 1000000); |
static constexpr uint64 kMicrosToPicos = 1000ULL * 1000ULL; const uint64 profile_start_time_micros event->set_timestamp_ps( + (node.all_start_micros() - profile_start_time_micros) * + EnvTime::kMicrosToPicos); + event->set_duration_ps(node.all_end_rel_micros() * + EnvTime::kMicrosToPicos); |
|||||
191 | Tensorflow/Keras | f1d0c84f699624382c8d66e2ea10205ac0207868 | C++ | disable test | overflow | overflow | precision tests/speed benchmarks | timing | timing | disable test/warning | disable overflow test | Skip overflow testing when running with address sanitizer. | ||||||||
192 | Tensorflow/Keras | 84337310517914ca4b4d6eb35295a65758bc6d75 | C++ | unit test | overflow | overflow | AveragePool uses a uint16 accumulator which causes it to overflow for large images |
CNN operations | pooling layer | average pooling | add overflow check | add overflow check | add overflow test | N/A | // Send in a white image, expect a white pixel. +TEST(QuantizedPoolingOpTest, AveragePoolImageSize16) { + int image_size = 16; + QuantizedPoolingOpModel m( + BuiltinOperator_AVERAGE_POOL_2D, + /*input=*/{TensorType_UINT8, {1, image_size, image_size, 1}, 0, 16}, + /*filter_width=*/image_size, + /*filter_height=*/image_size, + /*output=*/{TensorType_UINT8, {}, 0, 16}); + + std::vector<float> input(image_size * image_size, 16.f); + m.SetInput(input); + m.Invoke(); + + EXPECT_THAT(m.GetOutput(), ::testing::ElementsAre(255)); + EXPECT_THAT(m.GetDequantizedOutput(), ElementsAreArray(ArrayFloatNear({16}))); +} + +// Send in a white image, expect something other than a white pixel, due to +// overflow. +TEST(QuantizedPoolingOpTest, AveragePoolImageSize17) { + int image_size = 17; + QuantizedPoolingOpModel m( + BuiltinOperator_AVERAGE_POOL_2D, + /*input=*/{TensorType_UINT8, {1, image_size, image_size, 1}, 0, 16}, + /*filter_width=*/image_size, + /*filter_height=*/image_size, + /*output=*/{TensorType_UINT8, {}, 0, 16}); + + std::vector<float> input(image_size * image_size, 16.f); + m.SetInput(input); + m.Invoke(); + + // Ordinarily we would see '255' here. However, the optimized version of + // AveragePool uses a uint16 accumulator which causes it to overflow for + // images this large. + EXPECT_THAT(m.GetOutput(), ::testing::ElementsAre(28)); |
N/A | ||||
193 | Tensorflow/Keras | 434dbe38970ffc90a5b546780be702e0b5de9a0c | C++ | fix | overflow | undefined behavior | overflow | undefined behavior caused by integer overflow in custom float comparison | compiler | compiler | casting | limit input range | limit input range | consider numeric limits | return std::numeric_limits<CastType>::max() - casted_value; | return static_cast<UnsignedCastType>(std::numeric_limits<CastType>::max()) - + casted_value; |
||||
194 | Tensorflow/Keras | fc44600e5c3ccf1de1e3d4792a00d3578311d3f6 | Cuda | fix | overflow | overflow | index integer overflow | linear algebra | linear algebra | row reduce | rewrite math formula | rewrite math formula | rewrite formula | const int row = (blockIdx.x * blockDim.x + threadIdx.x) / 32; std::size_t temp_storage_bytes = 0; - Tensor temp_storage; - // written as a loop because it reduces clutter - // first pass allocates memory, second launches kernel(s) - for (int i = 0; i < 2; ++i) { - auto success = cub::DeviceReduce::Reduce( - i == 0 ? nullptr : temp_storage.flat<int8_t>().data(), - temp_storage_bytes, in, out, in_size, op, init, cu_stream); |
assert(blockDim.x % 32 == 0); + int warps_per_block = blockDim.x / 32; + int warp_index = threadIdx.x / 32; + const int row = blockIdx.x * warps_per_block + warp_index; size_t temp_storage_bytes = 0; + auto reduce = [&](void* temp_storage_ptr) { + auto success = + cub::DeviceReduce::Reduce(temp_storage_ptr, temp_storage_bytes, in, out, + in_size, op, init, cu_stream); |
|||||
195 | Tensorflow/Keras | e66aea59e0367618f924ffe3bc3b1140be8eaf45 | C++ | fix | underflow | underflow | underflow if the data is empty | data processing | data | tf.data / Cloud Bigtable | rewrite math formula | rewrite math formula | change the order of operations to avoid unsigned wraparound (see the sketch after this row) | if (index_ > keys_.size() - 2) {
if (index_ + 2 > keys_.size()) { | |||||
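A small self-contained illustration (with assumed types; not the Cloud Bigtable code itself) of the unsigned wraparound that the reordering in entry 195 avoids when the container is empty:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int main() {
  std::vector<int> keys;     // empty, so keys.size() == 0
  std::size_t index = 0;
  // size() is unsigned: 0 - 2 wraps to a huge value, so the old comparison
  // wrongly reports that more data remains.
  bool old_check = index > keys.size() - 2;  // false (wrong)
  // Moving the +2 to the left side keeps every operand non-negative.
  bool new_check = index + 2 > keys.size();  // true (correct)
  assert(!old_check && new_check);
  return 0;
}
```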
196 | Tensorflow/Keras | 880390941ce6430996c8f842540f73b53f3d1d8e | Python | fix | overflow | overflow | int32 overflow | data processing | parallelism | parallelism | increase variable precision/change variable type | increase variable precision | use int64 number of segments to guard against int32 overflow | num_segments *= n | num_segments = math_ops.cast(num_segments, dtypes.int64) * math_ops.cast( + n, dtypes.int64) |
|||||
197 | Tensorflow/Keras | f0d7172a30954b6696bdf2f40a5be11e7fdeb39c | C++ | fix | overflow | overflow | int overflow | compiler | compiler | compiler, shape inference | add overflow check | add overflow check | return invalid argument if number of features is not positive | if (feature_group_count <= 0) { + return InvalidArgument( + "feature_group_count must be a positive number, got %d", + feature_group_count); |
||||||
198 | Tensorflow/Keras | 63bac283d12899a2d769a768729942c4f64436ea | C++ | fix | overflow | undefined behavior | overflow | undefined behavior due to signed integer overflow | data processing | image processing | bmp image decoding | rewrite math formula | rewrite math formula | rewrite the row-size formula so the intermediate value is not multiplied by 8 (see the sketch after this row) | const int row_size = (8 * channels_ * width + 31) / 32 * 4; | const int row_size = (channels_ * width + 3) / 4 * 4; |
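Both row-size expressions in entry 198 round `channels_ * width` up to a multiple of 4 bytes; the sketch below (with made-up values, computed in int64_t to sidestep the signed-overflow undefined behavior being fixed) shows why the old intermediate `8 * channels_ * width + 31` exceeds the int32 range roughly eight times sooner:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
  const int64_t channels = 4;
  const int64_t width = 400000000;  // hypothetical oversized image row
  // Old form: pad the bit count to 32 bits, then convert to bytes.
  const int64_t old_intermediate = 8 * channels * width + 31;   // ~1.28e10 > INT32_MAX
  // New form: pad the byte count directly to 4 bytes; same final stride,
  // much smaller intermediate value.
  const int64_t new_row_size = (channels * width + 3) / 4 * 4;  // ~1.6e9, fits in int32
  std::printf("old intermediate: %lld (INT32_MAX = %d), new row size: %lld\n",
              static_cast<long long>(old_intermediate), INT32_MAX,
              static_cast<long long>(new_row_size));
  return 0;
}
```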
199 | Tensorflow/Keras | 4f7a169a7eb97ea4819217f14705d6c2bd125355 | C++ | fix | overflow | overflow | Need to handle overflow in division and remainder | compiler | compiler | compiler, elemental emitter, division, remainder | add overflow check | add overflow check | Define integer division/remainder overflow behavior for CPU/GPU (see the sketch after this row) | N/A | X / 0 == -1 X % 0 == X INT_SMIN / -1 = INT_SMIN INT_SMIN % -1 = 0 |
|||||
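A minimal sketch making the division/remainder semantics listed in entry 199 concrete; `GuardedDiv` and `GuardedRem` are hypothetical helper names chosen for illustration, since both special cases are undefined behavior for the plain C++ operators:

```cpp
#include <climits>

// X / 0 == -1 and INT_SMIN / -1 == INT_SMIN, per the table entry above.
int GuardedDiv(int x, int y) {
  if (y == 0) return -1;
  if (x == INT_MIN && y == -1) return INT_MIN;
  return x / y;
}

// X % 0 == X and INT_SMIN % -1 == 0, per the table entry above.
int GuardedRem(int x, int y) {
  if (y == 0) return x;
  if (x == INT_MIN && y == -1) return 0;
  return x % y;
}
```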
200 | Tensorflow/Keras | d7ebc1f4ca2c677710c5257d30c757f0f8b604c6 | Python | fix | overflow | overflow | overflow in flops calculations in nn_ops.py | CNN operations | CNN | flops calculation, product | increase variable precision/change variable type | increase variable precision | use int64 for product calculation | output_count = np.prod(output_shape.as_list()) |
output_count = np.prod(output_shape.as_list(), dtype=np.int64) | |||||
201 | Tensorflow/Keras | e7674c09a151cac07bae43f6fe8551e8fec6dfe0 | C++ | fix | overflow | overflow | array index overflow in TransformFilter functor | CNN operations | convolution | indexing, 2D convolution | limit input range | limit input range | subtract 2 from the number of dimensions to iterate over | for (int i = 0; i < NDIMS; ++i) { // spatial dimensions | for (int i = 0; i < NDIMS - 2; ++i) { // spatial dimensions |
202 | Tensorflow/Keras | aec5a0191e21ce022f47d743a4954e13f710cd8f | C++ | fix | overflow | overflow | very large and branchy models, where the number of paths is exponential in the number of nodes, can overflow; specifically, an overflow in hlo_scheduling when compiling AutoML models | compiler | compiler | compiler, HLO (high level operations) | limit input range | limit input range | cap the per-HLO extra-user count at the total number of HLOs | N/A | int64 total_hlos = computation.parent()->NumUniqueInstructionIds(); extra_users[hlo] = std::min(extra_users[hlo], total_hlos); |
|||||
203 | Tensorflow/Keras | 503b7c11b44ee8b238946b345efea503058652c0 | Python | disable test | overflow | overflow | SinhArcsinh: Y = g(X) = Sinh( (Arcsinh(X) + skewness) * tailweight ) * multiplier. | overflow test fails | other | transformations | square, Bijective transformations | disable test/warning | disable overflow test | Skipped the check that fails due to overflow error | # Do the numpy calculation in float128 to avoid inf/nan. - y_float128 = np.float128(y) - self.assertAllClose( - np.log(np.cosh( - np.arcsinh(y_float128) / tailweight - skewness) / np.sqrt( - y_float128**2 + 1)) - - np.log(tailweight), - bijector.inverse_log_det_jacobian(y, event_ndims=0).eval(), - rtol=1e-4, - atol=0.) |
# On IBM PPC systems, longdouble (np.float128) is same as double except that it can have more precision. + # Type double being of 8 bytes, can't hold square of max of float64 (which is also 8 bytes) and + # below test fails due to overflow error giving inf. So this check avoids that error by skipping square + # calculation and corresponding assert. + + if np.amax(y) <= np.sqrt(np.finfo(np.float128).max) and \ + np.fabs(np.amin(y)) <= np.sqrt(np.fabs(np.finfo(np.float128).min)): + + # Do the numpy calculation in float128 to avoid inf/nan. + y_float128 = np.float128(y) + self.assertAllClose( + np.log(np.cosh( + np.arcsinh(y_float128) / tailweight - skewness) / np.sqrt( + y_float128**2 + 1)) - + np.log(tailweight), + bijector.inverse_log_det_jacobian(y, event_ndims=0).eval(), + rtol=1e-4, + atol=0.) |
||||
204 | Tensorflow/Keras | f5dbc1e16622f433f41f195bb33f56d674a004ce | C++ | fix | overflow | overflow | TensorFlow Lite Converter converts TensorFlow graphs into TensorFlow Lite graphs | overflow in shape calculation: TensorFlow's shapes use int64, while TOCO uses int. |
tensor math | tensor shape | shape, Tensorflow Lite Converter (TOCO) | add overflow check | add overflow check | |||||||
205 | Tensorflow/Keras | 9f312f32091534bfc115212d2ec7c838180df663 | C++ | fix | overflow | overflow | overflow due to large values | other | random number generator | random tensor generation | limit input range | limit input range | Updating Generate Random Tensor to generate tensors whose values are small and do not cause overflow for arithmetic operations. | tensor.flat<T>() = tensor.flat<T>().random(); | for (auto i = 0; i < tensor.NumElements(); i++) + tensor.flat<T>()(i) = i + random::New64() % 10; |
|||||
206 | Tensorflow/Keras | 6a7779f3384e48012d3e27ae0f48d410f5174d06 | C++ | fix | overflow | overflow | undefined signed integer overflow | statistical distributions | statistical distributions | random uniform distribution | limit input range | limit input range | impose conditions on random number generation to prevent overflow | result[i] = lo_ + static_cast<int32>(sample[i] % range_); result[i] = lo_ + static_cast<int64>(bits % range_); |
template <typename Int> +PHILOX_DEVICE_INLINE Int SignedAdd(Int a, + typename std::make_unsigned<Int>::type b) { + auto b_div_2 = b >> 1; + return a + static_cast<Int>(b_div_2) + static_cast<Int>(b - b_div_2); result[i] = SignedAdd(lo_, sample[i] % range_); result[i] = SignedAdd(lo_, bits % range_); |
|||||
207 | Tensorflow/Keras | d107fee1e4a9a4462f01564798d345802acc2aef | C++ | fix | overflow | overflow | other | other | I/O | limit input range | limit input range | consider numeric limits | N/A | if (kBlockTrailerSize > std::numeric_limits<size_t>::max() - n) { + return errors::DataLoss("handle.size() too big"); + } |
||||||
208 | Tensorflow/Keras | 665a4bf664546224c65eeb5a0a52d80e48e2f3e1 | C++ | fix | overflow | overflow | int64 overflow and low accuracy | compiler | compiler | compiler, HLO (high level operations), size | use a different algorithm | use a different algorithm | The new implementation computes the min of the previous overestimate and the sum of all HLOs before and including the current HLO in a topological sort of the graph. |
|||||||
209 | Tensorflow/Keras | 11f1e50886f91ce2caa6e53b0bc9a1e82abdda8e | Python | unit test | overflow | overflow | exp() test overflowing | tensor math | tensor math | exponential | limit input range | limit input range | Keep the results below 2^31 in exp(), consider min and max | create_tensor_data(parameters["input_dtype"], parameters["input_shape"]) | create_tensor_data(parameters["input_dtype"], parameters["input_shape"], + min_value=-100, max_value=9) |
|||||
210 | Tensorflow/Keras | 49f73c55d56edffebde4bca4a407ad69c1cae433 | C++ | fix | overflow | overflow | integer overflow | data processing | image processing | bmp image decoding | increase variable precision/change variable type | increase variable precision | Fix integer overflow in the BMP decoder by making the checks in DecodeBmp more stringent. Total possible pixel bytes must be less than 2^30. Also, increase the precision of the image size from int to int64. Add a fuzzer to improve the robustness of the decoder in the future. |
const int last_pixel_offset = - header_size + (abs(height) - 1) * row_size + (width - 1) * channels_; - const int expected_file_size = last_pixel_offset + channels_; |
OP_REQUIRES(context, width > 0 && header_size >= 0, + errors::InvalidArgument("Width must be positive")); + OP_REQUIRES(context, header_size >= 0, + errors::InvalidArgument("header size must be nonnegative")); + + // The real requirement is < 2^31 minus some headers and channel data, + // so rounding down to something that's still ridiculously big. + OP_REQUIRES( + context, + (static_cast<int64>(width) * std::abs(static_cast<int64>(height))) < + static_cast<int64>(std::numeric_limits<int32_t>::max() / 8), + errors::InvalidArgument( + "Total possible pixel bytes must be less than 2^30")); + + const int32 abs_height = abs(height); const int64 last_pixel_offset = static_cast<int64>(header_size) + + (abs_height - 1) * row_size + + (width - 1) * channels_; const int64 expected_file_size = last_pixel_offset + channels_; |
|||||
211 | Tensorflow/Keras | 7f88363810e77a39db919fb4000583ad0138e53c | C++ | fix | overflow | overflow | integer overflow | other | computational graph | shape size propagation in a tf graph | increase variable precision/change variable type | increase variable precision | increase precision from int to int64 for max loops | const int num_loops = new_shapes->size(); - const int max_loop_length = item_.graph.node_size(); - const int max_rank = 4; - const int max_loop_iterations = - max_rank * max_loop_length * std::max(1, num_loops * num_loops); - const int num_queues = resources.size(); - const int max_resource_iterations = num_queues * num_queues * max_rank; - - int num_resource_iterations = 0; int num_loop_iterations = 0; |
const int64 num_loops = new_shapes->size(); + const int64 max_loop_length = item_.graph.node_size(); + const int64 max_rank = 4; + const int64 max_loop_iterations = + max_rank * max_loop_length * std::max<int64>(1, num_loops * num_loops); + const int64 num_queues = resources.size(); + const int64 max_resource_iterations = num_queues * num_queues * max_rank; + + int64 num_resource_iterations = 0; do { + int64 num_loop_iterations = 0; |
|||||
212 | Tensorflow/Keras | 192f1c24ec6692342391c03bb620f5de1af9de3b | C++ | fix | overflow | overflow | integer overflow | data processing | parallelism | parallelism | rewrite math formula | rewrite math formula | rewrite formula for calculating maximum number of elements | - input_shape.num_elements() >= - std::max(num_threads, num_split) * 4096 && - input_shape.num_elements() < num_split * 180 * 1024); - num_split, kint64max, range_output_func); - input_shape.num_elements() >= - std::max(num_threads, num_split) * 4096 && - input_shape.num_elements() < num_split * 180 * 1024); - num_split, kint64max, range_output_func); |
const auto input_element_count = input_shape.num_elements(); input_element_count >= std::max(num_threads, num_split) * 4096 && + input_element_count < num_split * 180 * 1024); num_split, input_element_count / num_split, range_output_func); input_element_count >= std::max(num_threads, num_split) * 4096 && + input_element_count < num_split * 180 * 1024); num_split, input_element_count / num_split, range_output_func); |
|||||
213 | Tensorflow/Keras | b1c095a28a7aa9bbee4af4d9a7e9d0c60567765b | Python | fix | underflow | underflow | underflow in log probability | statistical distributions | statistical distributions | multinomial distribution, log probability | use a different algorithm | use a different algorithm | use log_softmax of the logits instead of the log of the probabilities (see the worked example after this row) | return math_ops.reduce_sum(counts * math_ops.log(self.probs), -1)
return math_ops.reduce_sum(counts * nn_ops.log_softmax(self.logits), -1) | def testPmfUnderflow(self): + logits = np.array([[-200, 0]], dtype=np.float32) + with self.test_session(): + dist = multinomial.Multinomial(total_count=1., logits=logits) + lp = dist.log_prob([1., 0.]).eval()[0] + self.assertAllClose(-200, lp, atol=0, rtol=1e-6) |
||||
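A plain-NumPy sketch (not the TensorFlow code) of why row 213's patch matters: taking log(softmax(logits)) underflows in float32 for very negative logits, while a fused log-softmax that subtracts the maximum first does not. The -200 logit mirrors the row's testPmfUnderflow test.

```python
import numpy as np

logits = np.array([-200.0, 0.0], dtype=np.float32)

# Naive route: softmax first, then log. exp(-200) underflows to 0 in float32,
# so the log returns -inf (NumPy warns about the log of zero).
probs = np.exp(logits) / np.sum(np.exp(logits))
print(np.log(probs))            # [-inf, 0.]

# Fused log-softmax: shift by the max before exponentiating.
shifted = logits - np.max(logits)
log_probs = shifted - np.log(np.sum(np.exp(shifted)))
print(log_probs)                # [-200., 0.]
```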
214 | Tensorflow/Keras | 74137f994faad09593ae2daad6251a4ccf72f558 | C++ | fix | overflow | overflow | When a node name has a long numeric suffix, e.g., "foo/y_0/gradient_debug_09684b60f2184c67b744721915034528" (as has happened with tfdbg GradientsDebugger), the parsing algorithm in ParseTensorName() may experience signed int overflow. |
other | other | tensor name parser | increase variable precision/change variable type | change variable type | use unsigned int instead of signed int | - int index = 0; - int mul = 1; |
unsigned int index = 0; + unsigned int mul = 1; |
|||||
215 | Tensorflow/Keras | 793fa4e91d3cae77565f753c2b8d769e1a3928f8 | Python | fix | overflow | overflow | The vimco package provides a Bayesian variable selection method for GWAS data with multiple traits. Unlike BVSR, where each trait is analyzed separately, vimco performs a joint analysis of the multiple traits while accounting for correlation among them. Csiszar f-Divergence generalized VIMCO objective | overflow issue in Csiszar-VIMCO | other | probability | gradient estimator, csiszar divergence | rewrite math formula | rewrite math formula | rewrite the log-sum-exp formula: subtract the maximum from the input before exponentiating | log_sum_u = math_ops.reduce_logsumexp(logu, axis=0) return log_sum_u - log_n, log_soosum_u - log_n |
log_max_u = math_ops.reduce_max(logu, axis=0) + log_sum_u_minus_log_max_u = math_ops.reduce_logsumexp( + logu - log_max_u, axis=0) is_positive_and_largest = math_ops.logical_and( + logu > 0., + math_ops.equal(logu, log_max_u[array_ops.newaxis, ...])) + log_lomsum_u = math_ops.reduce_logsumexp( + array_ops.where(is_positive_and_largest, + array_ops.fill(array_ops.shape(logu), -inf), + logu), + axis=0, keep_dims=True) + log_lomsum_u = array_ops.tile( + log_lomsum_u, + multiples=1 + array_ops.pad([n-1], [[0, array_ops.rank(logu)-1]])) + + d_not_ok_result = array_ops.where( + is_positive_and_largest, + log_lomsum_u, + array_ops.fill(array_ops.shape(d), -inf)) + + log_loosum_u = array_ops.where(d_ok, d_ok_result, d_not_ok_result) log_avg_u = log_sum_u_minus_log_max_u + log_max_u - log_n + log_sooavg_u = log_soosum_u - log_n + + log_avg_u.set_shape(logu.shape.with_rank_at_least(1)[1:]) + log_sooavg_u.set_shape(logu.shape) + + return log_avg_u, log_sooavg_u |
||||
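Row 215's fix is the standard max-subtraction trick behind reduce_logsumexp. A small NumPy sketch (illustrative only, not the TensorFlow implementation) of why it is needed:

```python
import numpy as np

logu = np.array([1000.0, 1001.0, 999.0])   # large log-values, float64

# Direct evaluation overflows: exp(1000) is inf in float64.
print(np.log(np.sum(np.exp(logu))))         # inf

# Subtract the maximum before exponentiating, add it back afterwards.
log_max_u = np.max(logu)
print(log_max_u + np.log(np.sum(np.exp(logu - log_max_u))))   # ~1001.41
```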
216 | Tensorflow/Keras | d906c963269dd1522c7693c8f944e6a846b86221 | C++ | unit test | overflow | overflow | signed integer overflows detected with -fsanitize=signed-integer-overflow | compiler | compiler | compiler, shape inference | increase variable precision/change variable type | change variable type, add overflow check | use unsigned int instead of signed int to prevent undefined behavior and report error if overflow | const int64 b = a + 1; const int64 sum = first_value + second_value; - int64 result = 0; [](int i) { return static_cast<float>(i * i * i); }); |
const int64 b = a - 1; const int64 sum = static_cast<uint64>(first_value) + second_value; uint64 result = 0; [](int i) { return static_cast<float>(i) * i * i; }); |
|||||
217 | Tensorflow/Keras | 931fd84bb72df0500f512d5d92ec0bef2ea461be | Python | fix | overflow | overflow | numpy.prod overflow on windows | gradients/derivatives | gradients | shape, gradient, tensor | increase variable precision/change variable type | increase variable precision | perform computations in int64 instead of int32 and then convert result to int32 | shape_size = np.prod(shape) num_elements = np.prod(shape) params_shape = array_ops.shape(params) |
shape_size = np.prod(shape, dtype=np.int64) num_elements = np.prod(shape, dtype=np.int64) params_shape = array_ops.shape(params, out_type=ops.dtypes.int64) params_shape = math_ops.to_int32(params_shape) |
|||||
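For row 217, the platform dependence comes from np.prod defaulting to the platform's native integer width (32-bit on Windows). A sketch that reproduces the wraparound everywhere by forcing int32, alongside the patched int64 call (the shape values are invented):

```python
import numpy as np

shape = [65536, 65536]                 # 2**32 elements in total (hypothetical shape)

# Emulates the Windows behavior: 32-bit accumulation silently wraps to 0.
print(np.prod(shape, dtype=np.int32))  # 0
# The patched call: 64-bit accumulation gives the true element count.
print(np.prod(shape, dtype=np.int64))  # 4294967296
```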
218 | Tensorflow/Keras | e8ee5286a686c6fc3057ba7cf9ba9ef7003789a6 | C++ | fix | overflow | overflow | data processing | data | tensor shape, multiply, size | limit input range | limit input range, add overflow check | Remove the 2**40 size limit on TensorShape and use std::numeric_limits instead. The previous TensorShape code did not check for overflow when multiplying | ||||||||
219 | Tensorflow/Keras | 3c9ba5673cf560ded0739530b673ab0a05d43630 | C++ | unit test | overflow | overflow | integer overflow, undefined behavior, square | other | random number generator | pseudo-random number generator | increase variable precision/change variable type | increase variable precision | cast from int32 to int64 | sum += Square(counts[i] - expected_count); | sum += Square(static_cast<int64>(counts[i] - expected_count)); | |||||
220 | Tensorflow/Keras | 60e7360dfcf8951c4a269cfddd2a9cf2a05d7f91 | Python | fix | overflow/underflow | overflow/underflow | Adjust the brightness of RGB or Grayscale images. | The current implementation (i.e. without clipping before conversion) introduces different behavior for images with different original data types, i.e. uint8 or float32. | data processing | image processing | images, adjust brightness | limit input range | limit input range | clip image into [0.0, 1.0] before converting back to original data type in 'adjust_brightness' | N/A | adjusted = clip_ops.clip_by_value(adjusted, 0.0, 1.0) | def testNegativeDeltaFloat(self): + x_shape = [2, 2, 3] + x_data = [0, 5, 13, 10, 135, 226, 37, 8, 245, 90, 255, 1] + x_np = np.array(x_data, dtype=np.float32).reshape(x_shape) / 255. + + y_data = [0, 0, 3, 0, 125, 216, 27, 0, 235, 80, 245, 0] + y_np = np.array(y_data, dtype=np.float32).reshape(x_shape) / 255. + + self._testBrightness(x_np, y_np, delta=-10. / 255.) |
|||
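A small NumPy sketch of the behavioral difference row 220 fixes (pixel values are invented; the real patch is the single clip_by_value call shown in the row): a negative brightness delta pushes float images below 0.0 where a uint8 image would have saturated, so clipping before converting back makes both dtypes agree.

```python
import numpy as np

delta = -10.0 / 255.0
img_float = np.array([0.0, 5.0 / 255.0, 0.9], dtype=np.float32)

adjusted = img_float + delta
print(adjusted)                          # first two entries go negative
print(np.clip(adjusted, 0.0, 1.0))       # matches the saturating uint8 behavior
```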
221 | Tensorflow/Keras | ec58d4042790e71172964383f737b249289d15af | Python | fix | underflow | underflow | statistical distributions | statistical distributions | gumbel distribution | limit input range | limit input range | set min value with np.finfo(np_dtype).tiny | ||||||||
222 | Tensorflow/Keras | 096ab75275862f973b2fd1a369a9fd25952a6c37 | C++ | fix | overflow | overflow | text files with more than 2B words overflow the corpus size counter | other | NLP | word to vec embedding, size | increase variable precision/change variable type | increase variable precision | increase precision of corpus size from int32 to int64 | int32 corpus_size_ = 0; | int64 corpus_size_ = 0; | |||||
223 | Tensorflow/Keras | e6e06b2fc89d41556d159d1181a558f8f5352b87 | C++ | fix | overflow | overflow | other | other | strings | rewrite math formula | rewrite math formula | rewrite formula for checking overflow | if (new_v < v) { | if (new_v / 8 < v) { | // (2^64-1)*10+9 + TestConsumeLeadingDigits("184467440737095516159yz", -1, + "184467440737095516159yz"); |
|||||
224 | Tensorflow/Keras | 4ad8912996a25136a280312de3801f30dd4d4a74 | C++ | unit test | overflow | overflow | overflow in float-int32 cast | quantization | quantization | quantization | use a different algorithm | use a different algorithm | const int values_count = sizeof(T) == 256 ? 256 : 50000; if (sizeof(T) == 256) { input_array(i) = Eigen::NumTraits<T>::lowest() + static_cast<int32>(q_range / values_count * i); |
const int values_count = sizeof(T) == 1 ? 256 : 50000; if (sizeof(T) == 1) { int64 offset = static_cast<int64>(q_range / values_count * i); + input_array(i) = static_cast<int32>( + Eigen::NumTraits<T>::lowest() + + std::min<int64>(Eigen::NumTraits<T>::highest(), offset)); |
||||||
225 | Tensorflow/Keras | f4264cb8e1ea70c612170ed72b9fe0382d1967a0 | C++ | fix | overflow | overflow | overflow when using float in eigen to quantize to QInt32 | quantization | quantization | quantization | limit input range | limit input range | use bounds that can be converted back to int32 without going outside the range of an int32. | static float upper_bound_float() { + return Eigen::numext::mini( + static_cast<float>(Eigen::NumTraits<T>::highest()), +2.147483520e+09f); static float lower_bound_float() { + return Eigen::numext::maxi( + static_cast<float>(Eigen::NumTraits<T>::lowest()), -2.147483648e+09f); |
||||||
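The specific constant in row 225's patch (+2.147483520e+09f) can be checked directly: INT32_MAX is not representable in float32 and rounds up to 2^31, which would overflow when cast back to int32, whereas 2147483520 is the largest float32 value that still fits. A quick NumPy verification:

```python
import numpy as np

# INT32_MAX rounds up to 2**31 in float32 (the spacing near 2**31 is 128).
print(np.float32(2147483647) == 2.0**31)     # True -> unsafe to cast back to int32
# The bound used by the patch is exactly representable and in range.
print(int(np.float32(2147483520)))           # 2147483520
# The next float32 above it is already 2**31.
print(np.nextafter(np.float32(2147483520), np.float32(np.inf)))  # 2.1474836e+09, i.e. 2**31
```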
226 | Tensorflow/Keras | 6047c6977dbc30f018b8b3ea0486ca907901dabb | C++ | fix | overflow | overflow | data processing | data | png I/O | increase variable precision/change variable type | increase variable precision | Force height*row_bytes computations to use 64 bits. | N/A | int64 height = static_cast<int64>(height_in); | ||||||
227 | Tensorflow/Keras | f4686d27a705bd547b828693462714d31bfd21ce | C++ | fix | overflow | overflow | static_cast overflow in WorkSharder Shard | data processing | data | shard, dataset | increase variable precision/change variable type | increase variable precision | cast intermediate variable to higher precision | const int num_shards = std::max( - 1, std::min<int>(num_workers, total * cost_per_unit / kMinCostPerShard)); |
const int num_shards = + std::max<int>(1, std::min(static_cast<int64>(num_workers), + total * cost_per_unit / kMinCostPerShard)); |
TEST(Shard, OverflowTest) { + thread::ThreadPool threads(Env::Default(), "test", 3); + mutex mu; + for (auto workers : {1, 2, 3}) { + const int64 total_elements = 1LL << 32; + const int64 cost_per_unit = 10000; + int num_shards = 0; + int64 num_elements = 0; + Shard(workers, &threads, total_elements, cost_per_unit, + [&mu, &num_shards, &num_elements](int64 start, int64 limit) { + mutex_lock l(mu); + ++num_shards; + num_elements += limit - start; + }); + EXPECT_EQ(num_shards, workers); + EXPECT_EQ(num_elements, total_elements); + } +} |
||||
228 | PyTorch | e6000a7c045cbece5fbfd7d933c39e40b1625037 | Python | Disable test | loss of precision | Quantize = convert from float32 to int8; dequantize = convert from int8 to float32. During training, all calculations are done in floating point, with fake_quant modules modeling quantization by clamping and rounding to simulate the effects of INT8. After model conversion, weights and activations are quantized, and activations are fused into the preceding layer where possible. It is commonly used with CNNs and yields higher accuracy compared to static quantization. Quantization Aware Training is also known as QAT. |
test_numerical_consistency_per_tensor in test_fake_quant is failing on Windows. The test is comparing numerical consistency between CPU quantize/dequantize op and the CPU fake quantize op. | quantization | quantization | testing, quantization | disable test/warning | disable precision test | Temporarily disables a test for comparing numerical consistency between CPU quantize/dequantize op and the CPU fake quantize op | |||||||
229 | PyTorch | 02d318461e5c7bded304c42ed7075de84f71dac6 | Python | Disable test | loss of precision | Quantized operations require FBGEMM. FBGEMM (Facebook GEneral Matrix Multiplication) is a low-precision, high-performance matrix-matrix multiplication and convolution library for server-side inference. FBGEMM is only optimized for CPUs with instruction-set support for AVX2 or newer. In PyTorch, quantization currently supports two backends: fbgemm (for use on x86) and qnnpack (for use on ARM) |
Failing test | quantization | quantization | testing, quantization | disable test/warning | disable precision test | Temporarily disable test_numerical_consistency_per_channel due to failure | |||||||
230 | PyTorch | b7038f7c37e955f7400459bbfc9382a77b16377d | Python | Change exception to a warning | loss of precision | exception | This test script compares if two values are “close enough” and handles +inf, -inf, nan | numerical differences raise exception | precision tests/speed benchmarks | accuracy testing | tensor compare, testing accuracy, JIT | relax accuracy test tolerance | relax accuracy test tolerance | changes errors to warnings when numerical differences found by replacing self.assertRaisesRegex with assertWarnsRegex | ||||||
231 | PyTorch | 032e4f81a8df14fe8b7177957f73567fa04919e8 | Python | Unit test | overflow | Test for overflow does not verify that all listed conditions throw, just the first one | precision tests/speed benchmarks | overflow test | testing | fix test/warning | fix overflow check | Update the test to check that the correct exceptions are raised when attempting to convert an invalid value to a certain type. Refactor the code: add 'with' and 'assert' for every condition. | ||||||||
232 | PyTorch | 86abc8cd481bfa2b9bb741722770796966778ab1 | C++ | Change exception to a warning | overflow | PyTorch has a JIT compiler and a mechanism for inserting instructions while the compiler is compiling on the fly. In this case an overflow check is inserted. | other | other | C++ interpreter | fix test/warning | change variable type, change exception to a warning | Change the exception to a non-fatal warning. Also change the cast to use the unsigned 16- and 64-bit integer variants, which doubles the number of positive values these types can represent and gives them well-defined overflow behavior: instead of invoking undefined behavior, an overflowing value simply wraps around. | throw std::runtime_error("safe_narrow_cast<>() failed due to overflow"); safe_narrow_cast<int16_t, int64_t>(N)); |
TORCH_WARN( + "ATTENTION: your model computation is overflowing, safe_narrow_cast<>() failed"); + return v; safe_narrow_cast<uint16_t, uint64_t>(N)); |
||||||
233 | PyTorch | 2171f910531be28f7d5dd8e6ab8bff3a5486e6fd | Python | Unit test | overflow | ROCm is the first open-source software development platform for HPC/Hyperscale-class GPU computing | The test was previously turned off because of broken continuous integration on ROCm | precision tests/speed benchmarks | overflow test | testing overflow, Cuda | add overflow check | add overflow check | reenable cuda_kernel_loop_overflow_large test | def test_cuda_kernel_loop_overflow_large(self): # Make sure input.numel() > INT_MAX is handled: x = torch.randn(1, 1, 1, 2**31, dtype=torch.float16, device="cuda") |
||||||
234 | PyTorch | 916eee182c9dc8d335501f6672842c6d29f0af58 | Python | Unit test | overflow | A test that checks input shape of 2D convolution prints overflowed integers. Bug in error message: RuntimeError: Expected 4-dimensional input for 4-dimensional weight 6 1 5 5 2323362894317625376, but got 5-dimensional input of size [1, 10, 1, 28, 28] instead Correct error message: RuntimeError: Expected 4-dimensional input for 4-dimensional weight 6 1 5 5, but got 5-dimensional input of size [1, 10, 1, 28, 28] instead |
CNN operations | convolution | 2D convolution | fix test/warning | correct error message | add unit test to test shape mismatch for 2d convolutions | def test_mismatch_shape_conv2d(self): + x = torch.randn(1, 10, 1, 28, 28) + w = torch.randn(6, 1, 5, 5) + + with self.assertRaisesRegex(RuntimeError, + r'Expected 4-dimensional input for 4-dimensional weight 6 1 5 5,' + + r' but got 5-dimensional input of size \[1, 10, 1, 28, 28\] instead'): + + F.conv2d(x, w) |
|||||||
235 | PyTorch | 3805be62c1bb10b8bf4e645aac30d89efd8f79ab | Python | Unit test | overflow | quantization test fails due to overflow when width parameter is specified | quantization | quantization | quantization, testing | increase variable precision/change variable type | increase variable precision | skip the test and remove the width parameter (note: the test is no longer in PyTorch) | @given(A=hu.tensor(shapes=((3, 4, 5),), qparams=hu.qparams()), - b=st.floats(allow_infinity=False, allow_nan=False, width=32)) |
@unittest.skip("FIXME: Failing due to overflow error without width option") @given(A=hu.tensor(shapes=((3, 4, 5),), qparams=hu.qparams()), + b=st.floats(allow_infinity=False, allow_nan=False)) |
||||||
236 | PyTorch | 1ed488da4f88ec7b85ba5f6a4113908dda3681e3 | Python | Unit test | loss of precision | non-standard precision | non-standard precision | custom precision testing | fix test/warning | fix precision test | fix precision test for inplace mode |
for inplace in (True, False): - if len(decl) == 3: - name, constr, arg_constr = decl - desc = '' - elif len(decl) == 4: - name, constr, arg_constr, desc = decl if inplace: name = name + '_' if not hasattr(tensor, name): @@ -335,8 +337,6 @@ for decl in tests: if desc: test_name += '_' + desc - precision = custom_precision.get(name, TestCuda.precision) |
for t in types: tensor = t() gpu_tensor = get_gpu_type(t)() + if len(decl) == 3: + name, constr, arg_constr = decl + desc = '' + elif len(decl) == 4: + name, constr, arg_constr, desc = decl + + precision = custom_precision.get(name, TestCuda.precision) |
|||||||
237 | Tensorflow/Keras | 37af1b8790d633b9002ab04a0e664ca3c1dbe508 | Python | fix | loss of precision | data processing | batch normalization | batch normalization | rewrite math formula | rewrite math formula | Do not use a moving average in batch normalization, since the moments method used to compute the input mean already implements this logic in a numerically stable way | |||||||||
238 | Tensorflow/Keras | f93960d0afdcf59457b614158ee5575ca2acfe15 | Python | fix | N/A | incorrect comment about numerical stability | statistical distributions | statistical distributions | Beta distribution | fix test/warning | delete incorrect comment | |||||||||
239 | PyTorch | 8c8918c3412aa1a7a50df02cddfd66be948d2ace | C++ | Fix | overflow | non-standard precision | non-standard precision | half precision, overflow testing | fix test/warning | fix overflow check | make half precision overflow checks consistent with other types | template<> bool overflows<Half, double>(double f) { + using limit = std::numeric_limits<double>; + if (limit::has_infinity && std::isinf(f)) { + return false; + } |
||||||||
240 | PyTorch | 79c3ebc040c4bac896477030d8af4ac94bc6f440 | Python | Unit test | loss of precision | Unit test was not aware of the precision of inputs | activation functions | activation functions | testing | fix test/warning | fix overflow check | Add argument to make assertion aware of precision of inputs. | ||||||||
241 | PyTorch | 2b902e9738f5346050814b40db3ec67faf37128a | C++ | fix | loss of precision | An offset within an array or other data structure object is an integer indicating the distance (displacement) between the beginning of the object and a given element or point, presumably within the same object. The concept of a distance is valid only if all elements of the object are of the same size (typically given in bytes or words). For example, in A as an array of characters containing "abcdef", the fourth element containing the character 'd' has an offset of three from the start of A. In assembly language an offset usually denotes the number of address locations added to a base address in order to get to a specific absolute address. |
offset numerical bug when casting | quantization | quantization | quantization, caffe2, type conversion | increase variable precision/change variable type | change variable type | change all_offsets variable type from float to int32_t | std::vector<std::vector<float>>* all_offsets) | std::vector<std::vector<int32_t>>* all_offsets) | |||||
242 | PyTorch | 5292685d2f144d9781ab8b7991c0a1153098a477 | C++ | Fix | loss of precision | -inf, NaN | Logarithms of determinants of large positive definite matrices appear ubiquitously in ML. Log-determinant computation involves the Cholesky decomposition | loss of precision when the diagonal matrix contains small values. The log determinant of a square matrix yields -inf when the matrix entries are very small numbers. The result is -inf if the input has a zero determinant; if the input has a negative determinant, the result is NaN | linear algebra | linear algebra | linear algebra, log of matrix determinant | use a different algorithm | use a different algorithm | Use the sign of the diagonal of U instead of the matrix determinant when diag_U contains many small values. |
determinant of a matrix, log of a matrix | |||||
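A NumPy sketch of the -inf case described in row 242 (illustrative only; the PyTorch fix works on the LU factor's diagonal, here np.linalg.slogdet plays that role): for a matrix with many small diagonal entries the determinant underflows to 0, so log(det(A)) is -inf even though the log-determinant itself is perfectly representable.

```python
import numpy as np

A = np.diag(np.full(200, 1e-3))          # det(A) = 1e-600, far below float64 range

print(np.log(np.linalg.det(A)))          # -inf (det underflowed to 0)

sign, logdet = np.linalg.slogdet(A)      # works with sums of logs of the factors
print(sign, logdet)                      # 1.0 -1381.55...
```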
243 | PyTorch | 67f2039f4ce233754910ebc24fbfcc8bc68685ae | Python | Fix | inefficient algorithm | slow execution | The binomial distribution is used when there are exactly two mutually exclusive outcomes of a trial, e.g., a coin toss has only two outcomes: heads and tails. A single binary outcome has a Bernoulli distribution, and a sequence of binary outcomes has a Binomial distribution. The binomial distribution gives the discrete probability distribution P_p(n|N) of obtaining exactly n successes out of N Bernoulli trials (where the result of each Bernoulli trial is true with probability p and false with probability q=1-p). | Log probability in the binomial distribution has numerical stability issues. The issue manifests itself when `total_count` is high and `probs` is very low; the step size becomes unreasonably small | statistical distributions | statistical distributions | distributions, log probability, binomial distribution | rewrite math formula | rewrite math formula | the log probability method in the binomial distribution is unstable | max_val = (-self.logits).clamp(min=0.0) value * self.logits + self.total_count * max_val - - self.total_count * torch.log1p((self.logits + 2 * max_val).exp())) |
value * self.logits - self.total_count * torch.log1p(self.logits.exp())) | @unittest.skipIf(not TEST_NUMPY, "NumPy not found") def test_binomial_log_prob_float(self): probs = torch.tensor([1e-5, 0.99999], dtype=torch.float) total_count = 1000000. x = torch.tensor([10, 9999], dtype=torch.float) expected = scipy.stats.binom(total_count, probs.numpy()).logpmf(x.numpy()) log_prob = Binomial(total_count, probs).log_prob(x) # Comparison is again scipy distributions which use float64. self.assertTrue(np.allclose(log_prob, expected, rtol=0.05)) logits = probs_to_logits(probs, is_binary=True) log_prob = Binomial(total_count, logits=logits).log_prob(x) self.assertTrue(np.allclose(log_prob, expected, rtol=0.05)) |
|||
244 | PyTorch | a17c0118a52d34c97ab48bae416ae1896ad14e56 | C++ | Fix | overflow | NaN loss | overflow | Binary Cross Entropy (BCE) is a loss function used for binary classification tasks to measure the difference between true labels and predicted labels. BCE with logits takes logits rather than predicted probabilities as input, but serves the same purpose. | Binary cross entropy with logits is unstable with the positive-weight argument when logits are large negative values, and results in inf. The positive weight is a weight for positive examples and must be a vector with length equal to the number of classes. | loss functions | loss functions | binary cross entropy loss | rewrite math formula | rewrite math formula | instead of multiplying by 1 + exp(-input), add exp(-input - max_val) | loss = (1 - target).mul_(input).add_(log_weight.mul_((-max_val).exp_().mul_(1 + (-input).exp_()).log_().add_(max_val))); | loss = (1 - target).mul_(input).add_(log_weight.mul_(((-max_val).exp_().add_((-input - max_val).exp_())).log_().add_(max_val)));
def test_bce_with_logits_stability(self): + output = torch.tensor([0., -120.]) + target = torch.tensor([0., 1.]) + pos_weight = torch.tensor([1., 1.]) + + out1 = nn.BCEWithLogitsLoss()(output, target) + self.assertTrue(torch.isfinite(out1).all().item()) + + out2 = nn.BCEWithLogitsLoss(pos_weight=pos_weight)(output, target) + self.assertTrue(torch.isfinite(out2).all().item()) |
H_p(q) = -(1/N) * sum_{i=1..N} [ y_i * log(p(y_i)) + (1 - y_i) * log(1 - p(y_i)) ] | log, multiply |
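Row 244's patch relies on the usual max-shift rewrite of the softplus term. A float32 NumPy sketch (not the ATen code) with the -120 logit from the row's regression test: log1p(exp(-x)) overflows for x = -120, while max(-x, 0) + log(exp(-max) + exp(-x - max)) is finite and equal to the same quantity.

```python
import numpy as np

x = np.array([-120.0], dtype=np.float32)     # large negative logit
neg_x = -x

# Naive: exp(120) overflows float32, so the whole term becomes inf.
print(np.log1p(np.exp(neg_x)))               # [inf]

# Max-shifted form used by the patch (max_val = max(-input, 0)).
max_val = np.maximum(neg_x, np.float32(0.0))
print(max_val + np.log(np.exp(-max_val) + np.exp(neg_x - max_val)))   # [120.]
```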
245 | PyTorch | 00d2befba11a1e9c85146a4470721eb75596d5b7 | Cuda | Fix | loss of precision | TH = TorcH This is in directory aten/src, which contains the low-level tensor libraries for PyTorch, as well as the new ATen C++ bindings. The low-level libraries trace their lineage from the original Torch. There are multiple variants of the library, summarized here: TH = TorcH THC = TorcH Cuda THCS = TorcH Cuda Sparse (now defunct) THCUNN = TorcH CUda Neural Network (see cunn) THNN = TorcH Neural Network (now defunct) THS = TorcH Sparse (now defunct) |
unstable variance over the outer dimensions of a TorcH Cuda tensor (THTensor_varOuterDim) | tensor math | tensor math | low level tensor math, variance calculation, GPU | increase variable precision/change variable type | change variable type | Use the accreal variable type instead of real | def test_var_stability(self): tensor = torch.FloatTensor([2281.5, 2281.25]).cuda() # Stability for inner dim self.assertEqual(tensor.var(0)[0], 0.03125) # General stability self.assertEqual(tensor.var(), 0.03125) # Stability for outer dimensions tensor = tensor.unsqueeze(1) self.assertEqual(tensor.var(0)[0], 0.03125) |
variance | |||||
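Illustrative only (this is not the TH kernel from row 245, which uses a different accumulation scheme): with float32 accumulation a sum-of-squares style variance cancels catastrophically for the values in the row's test, while accumulating at higher precision, in the spirit of the real to accreal change, recovers the expected unbiased variance of 0.03125.

```python
import numpy as np

x32 = np.array([2281.5, 2281.25], dtype=np.float32)
x64 = x32.astype(np.float64)
n = x32.size

# E[x^2] - E[x]^2, scaled to the unbiased estimate.
var32 = (np.mean(x32 * x32) - np.mean(x32) ** 2) * n / (n - 1)
var64 = (np.mean(x64 * x64) - np.mean(x64) ** 2) * n / (n - 1)

print(var32)   # 0.0: the float32 squares cancel and the signal is lost
print(var64)   # 0.03125
```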
246 | PyTorch | 72a257584efa7fb63b14f09d19efc96caa5d6e4d | Cuda | Fix | overflow/underflow | overflow/underflow | Log sigmoid is a logistic non-linear activation function. However, softmax is typically preferred over sigmoid | numerically unstable logsigmoid | activation functions | activation functions | log sigmoid | rewrite math formula | rewrite math formula | rewrite formula for log sigmoid considering the maximum representable values | const float fmax = (float)((int32_t)(uint32_t)qmax - (int32_t)(uint32_t)zero_point); const T z = THCNumerics<T>::exp(- *input); - *gradInput = *gradOutput * z / (1.f + z); |
const T max = fmaxType(0.f, -*input); + const T z = THCNumerics<T>::exp(-max) + THCNumerics<T>::exp(-*input -max); + T max_deriv = 0.f; + T sign = -1.f; + if (*input < 0.f){ + max_deriv = -1.f; + sign = 1.f; + } + *gradInput = *gradOutput * (-max_deriv - sign*((z - 1.f)/z)); + *gradInput = *gradOutput * (-max_deriv - sign*((z - 1.f)/z)); |
log(1/(1 + e^(-x))) | log sigmoid |
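Row 246's rewrite can be illustrated with the forward value of log(1/(1 + e^(-x))): for large negative x the naive form produces -inf because e^(-x) overflows float32, while the max-shifted form stays finite. A NumPy sketch (the CUDA kernel in the row does the analogous rewrite for the backward pass):

```python
import numpy as np

x = np.array([-100.0], dtype=np.float32)

# Naive log-sigmoid: exp(100) overflows to inf, the sigmoid collapses to 0,
# and the log returns -inf.
print(np.log(1.0 / (1.0 + np.exp(-x))))                    # [-inf]

# Max-shifted form: -max(-x, 0) - log(exp(-max) + exp(-x - max)).
m = np.maximum(-x, np.float32(0.0))
print(-m - np.log(np.exp(-m) + np.exp(-x - m)))            # [-100.]
```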
247 | PyTorch | f555c6308c534dd3964d106f2551067fad6edaec | Cuda, C++ | Fix | loss of precision | Normalized gradient helps to ameliorate issues with gradient descent such as slow convergence and getting stuck in saddle points. The normalized gradient is the gradient divided by its magnitude; it therefore provides only the direction of the descent step and does not affect the step size. The gradient magnitude is calculated as the square root of the sum of squares of the gradient vector's components. | the gradient of the normalization operation is unstable due to the sum-of-squares computation | gradients/derivatives | gradients | gradient normalization | rewrite math formula | rewrite math formula | rewrite math formula | in Cuda: grad_mat[index] = (y_ij / x_ij) * (dy_ij - y_ij) * row_sum; in C++: gradInMat = ((outputMat / inputMat) * (gradOutMat - outputMat)).rowwise() * (gradOutMat * inputMat).colwise().sum(); |
in Cuda: grad_mat[index] = (dy_ij / row_norm) - ((x_ij / row_norm_3) * row_sum); in C++: auto square = inputMat.square(); auto norm = square.colwise().sum().sqrt(); gradInMat = gradOutMat.rowwise() * norm.inverse() - ((inputMat.rowwise() / norm.pow(3)).rowwise() * (gradOutMat * inputMat).colwise().sum()); |
gradient/||gradient|| | sum of squares, square root | |||
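A side note on the sum-of-squares term mentioned in row 247 (this is not the PyTorch patch, which rewrites the backward formula itself): forming the raw sum of squares can overflow float32 even when the norm is representable; rescaling by the largest component first avoids it.

```python
import numpy as np

v = np.array([3e20, 4e20], dtype=np.float32)

# Direct sum of squares: 9e40 overflows float32, so the norm comes out as inf.
print(np.sqrt(np.sum(v * v)))                  # inf

# Rescale by the largest magnitude first, then multiply back.
m = np.max(np.abs(v))
print(m * np.sqrt(np.sum((v / m) ** 2)))       # 5e+20
```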
248 | PyTorch | 9a153412fd4f78b9a9b59bbf85a358339fb69613 | C++, Python, Cuda | Fix | underflow | Rsample offers a reparametrization trick, where the parameterized random variable can be constructed via a parameterized deterministic function of a parameter-free random variable. The reparameterized sample therefore becomes differentiable. sample and rsample both generate samples from the distribution, but only rsample supports differentiating through the sampler. You should use rsample whenever you need to compute gradients of distribution parameters with respect to functions of samples, e.g. in variational inference. SOURCE: https://forum.pyro.ai/t/sample-vs-rsample/2344 sample is literally rsample wrapped in with torch.no_grad(), so when you don't need gradients. SOURCE: https://github.com/cornellius-gp/gpytorch/issues/764 |
underflow issue in method rsample of dirichlet distribution class | statistical distributions | statistical distributions | dirichlet distribution, sampling, forward pass | use a different algorithm | use a different algorithm | adds a `torch._sample_dirichlet` method in `Distributions.cpp` | def test_beta_underflow(self): + # For low values of (alpha, beta), the gamma samples can underflow + # with float32 and result in a spurious mode at 0.5. To prevent this, + # torch._sample_dirichlet works with double precision for intermediate + # calculations. + set_rng_seed(1) + num_samples = 50000 + for dtype in [torch.float, torch.double]: + conc = torch.tensor(1e-2, dtype=dtype) + beta_samples = Beta(conc, conc).sample([num_samples]) + self.assertEqual((beta_samples == 0).sum(), 0) + self.assertEqual((beta_samples == 1).sum(), 0) + # assert support is concentrated around 0 and 1 + frac_zeros = float((beta_samples < 0.1).sum()) / num_samples + frac_ones = float((beta_samples > 0.9).sum()) / num_samples + self.assertEqual(frac_zeros, 0.5, 0.05) + self.assertEqual(frac_ones, 0.5, 0.05) + + @unittest.skipIf(not TEST_CUDA, "CUDA not found") + def test_beta_underflow_gpu(self): + set_rng_seed(1) + num_samples = 50000 + conc = torch.tensor(1e-2, dtype=torch.float64).cuda() + beta_samples = Beta(conc, conc).sample([num_samples]) + self.assertEqual((beta_samples == 0).sum(), 0) + self.assertEqual((beta_samples == 1).sum(), 0) + # assert support is concentrated around 0 and 1 + frac_zeros = float((beta_samples < 0.1).sum()) / num_samples + frac_ones = float((beta_samples > 0.9).sum()) / num_samples + # TODO: increase precision once imbalance on GPU is fixed. + self.assertEqual(frac_zeros, 0.5, 0.12) + self.assertEqual(frac_ones, 0.5, 0.12) |
||||||
249 | PyTorch | 74819087de17de4c8215a7f631d8d4d18dd13d45 | C++ | Fix | inefficient algorithm | Mixed precision training with DDP (DistributedDataParallel) randomly hangs. The reason is that take_tensors generates a list of bucketed tensors in a non-deterministic order, because the key of the map is a pointer. | non-standard precision | non-standard precision | distributed data parallelization, mixed precision | use a different algorithm | use a different algorithm | use std::map instead of std::unordered_map to generate an ordered list of bucketed tensors for parallel training | std::unordered_map<at::Type*, TensorGroup> groups; | std::map<TypeID, TensorGroup> groups; | |||||
250 | PyTorch | 73bdb661feb195a8b98366db5750b998c025f709 | Python | Unit test | loss of precision | BCELoss's outputs and gradInput computations are accurate to around 1e-6 on float types (as a relative value, not absolute), which is reasonable. However, the tests use absolute thresholds: the accumulation of 5 gradInputs has to have an error less than 0.0002. | loss functions | loss functions | binary cross entropy loss, testing precision | rewrite math formula | rewrite math formula | restrict the input to [0.028, 1 - 0.028] instead of [0.02, 1 - 0.02] to decrease the error. The worst case for BCELoss's gradInput for each element may be described as 1 / ((1 - x) * x). Previously, the input to the test was restricted to [0.02, 1 - 0.02], resulting in a worst-case largest gradInput of 50, a total accumulated grad of 50*5 = 250, and an error of 250 * 1e-6 = 0.00025, which was too big. By restricting x to [0.028, 1 - 0.028] we get a worst case of 36.74, resulting in a total accumulated grad of 184, which is less than the 200 needed to keep the error below 0.0002. |
input_fn=lambda: torch.rand(15, 10).clamp_(2e-2, 1 - 2e-2) | input_fn=lambda: torch.rand(15, 10).clamp_(2.8e-2, 1 - 2.8e-2), | ||||||
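A quick check of the constants quoted in row 250 (not part of the patch itself): the worst-case per-element gradInput bound 1 / ((1 - x) * x) at the edge of each input range, and the resulting accumulated error for 5 elements at roughly 1e-6 relative accuracy.

```python
# Worst-case |gradInput| per element is roughly 1 / ((1 - x) * x) at the clamp edge.
for edge in (0.02, 0.028):
    bound = 1.0 / ((1.0 - edge) * edge)
    accumulated = 5 * bound          # five gradInputs are summed in the test
    error = accumulated * 1e-6       # ~1e-6 relative accuracy per element
    print(edge, round(bound, 2), round(accumulated, 1), error)
# 0.02  -> ~51 per element, ~255 accumulated, ~2.5e-4 error (over the 2e-4 threshold)
# 0.028 -> ~36.7 per element, ~184 accumulated, ~1.8e-4 error (within the threshold)
```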
251 | PyTorch | 912ee4e40a9f2f2f156e94a76a521d3ed4f49bd0 | Python | Unit test | loss of precision | failing unit test | linear algebra | linear algebra | testing, sparse to dense matrix conversion | rewrite math formula | rewrite math formula | elements=st.floats(min_value=0.5, max_value=10), dtype=dt)) D = np.random.uniform(0, 1, size=(first_dim,) + X.shape[1:]) |
elements=st.floats(min_value=0, max_value=1), dtype=dt)) D = np.zeros((first_dim,) + X.shape[1:]) |
|||||||
252 | PyTorch | b1fa9d2b06714de099e3ae1141d15dcbaba78dd3 | C | Fix | overflow | THFile is for loading data from disk or memory, but this is no longer part of PyTorch |
data processing | data | data loading | increase variable precision/change variable type | increase variable precision, add overflow check | increase precision to long, and add logic to check that the value is smaller than the maximum representable long