DeepStability

A Continuously Growing Database of Numerical Stability Vulnerabilities of Common Numerical Methods in Deep Learning

Index | Library | Commit hash | Language | Type of commit | Root Cause | Manifestation/End User Impact | IEEE arithmetic exception type | Background | Problem | DL Topic - level 1 | DL Topic - level 2 | DL Topic - level 3 | Patch type - level 1 | Patch type - level 2 | Patch type - level 3 | Old Solution | New Solution | Test | Math operation | References
1 PyTorch ac72881f3ff8c46c2a5cf8b09d02babf46bc4c85 CUDA Fix loss of precision inaccurate result of mean in batch normalization Inexact Sync batch norm applies Batch Normalization over an N-dimensional input (a mini-batch of [N-2]D inputs with an additional channel dimension)
y = ((x - E[x])/sqrt(Var[x] + epsilon)) * alpha + beta
numerical issue in CUDA channels-last SyncBatchNorm; apex SBN channels-last also has this issue data processing batch normalization batch normalization, Cuda rewrite math formula rewrite math formula Replace div_roundup() with ATenCeilDiv()
int div_roundup(int x, int y) {
   return lastPow2(1 + (x-1)/y);
}, where lastPow2 returns 2**floor(log2(n))
ATenCeilDiv(T a, T b) {
  return (a + b - 1) / b;
}
int div_roundup(int x, int y) {
   return lastPow2(1 + (x-1)/y);
}
static int lastPow2(unsigned int n) {
  n |= (n >> 1);
  n |= (n >> 2);
  n |= (n >> 4);
  n |= (n >> 8);
  n |= (n >> 16);
  return std::max<int>(1, n - (n >> 1));
}, where |= is a bitwise or operator
ATenCeilDiv(T a, T b) {
  return (a + b - 1) / b;
}
def _batch_norm_stats(self, data):
    mean1, _ = torch.batch_norm_stats(data, 1e-5)
    mean2, _ = torch.batch_norm_stats(data.to(memory_format=torch.channels_last), 1e-5)
    mean_ref = torch.mean(data, (0, 2, 3), keepdim=False)
    self.assertEqual(mean_ref, mean1)
    self.assertEqual(mean_ref, mean2)
division round up https://arxiv.org/abs/1502.03167
https://pytorch.org/docs/stable/generated/torch.nn.SyncBatchNorm.html
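A small Python re-implementation (hypothetical, for illustration only) of the two helpers from this entry, showing that the power-of-two rounding can return a different block count than an exact ceiling division:

def last_pow2(n):
    # largest power of two <= n, mirroring the C lastPow2 helper above
    p = 1
    while p * 2 <= n:
        p *= 2
    return p

def div_roundup(x, y):
    # old helper: ceiling division rounded to a power of two
    return last_pow2(1 + (x - 1) // y)

def aten_ceil_div(a, b):
    # new helper: exact ceiling division
    return (a + b - 1) // b

print(div_roundup(10, 4), aten_ceil_div(10, 4))  # 2 vs 3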
2 PyTorch dfc7fa03e5d33f909b9d7853dd001086f5d782a0 Python Fix loss of precision inaccurate result of gradient Inexact lower–upper (LU) decomposition (also called LU factorization) factors a matrix as the product of a lower triangular matrix and an upper triangular matrix. It is a procedure for decomposing an N×N matrix A into a product of a lower triangular matrix L and an upper triangular matrix U, LU=A.
Matrix A = LU. In the lower triangular matrix all elements above the diagonal are zero, in the upper triangular matrix, all the elements below the diagonal are zero.
LU decomposition is an efficient method used for solving a system of linear equations. Suppose we have B=AX and want to solve for X.  (The solution could be X = inverse(A)B. But a matrix inverse is numerically unstable.) Find LU decomposition of A, A = LU. So, B=AX=LUX. Then solve for X with two equations: (1) LY = B and (2) UX = Y
Matrix inverse is numerically unstable, as a result numerical and analytical gradients for LU decomposition are too different.

gradients for the LU decomposition calculation are unstable; lu_backward is implemented via autograd
torch.det is using LU in forward, while det_backward is using svd_backward (singular value decomposition).
The issue with svd_backward is that it is only stable for inputs with distinct singular values. As a result, TestGradientsCuda::test_fn_gradgrad_linalg_det_cuda_float64, which compares the numerical and analytical gradients, fails on Windows with a GPU. svd_backward is only stable for ranks n - 1 <= r <= n with singular values sufficiently far away from each other.
gradients/derivatives automatic differentiation gradients for the LU decomposition, backward pass, autograd, linear algebra operations, determinant of a square matrix use a different algorithm use a different algorithm Replace the matrix inverse with solutions to systems of linear triangular equations. A "triangular" system of equations is one whose equations form a triangle: each successive equation involves only the later variables.
However, this works only for square matrices of full rank.
-        I = LU_grad.new_zeros(LU_grad.shape)
-        I.diagonal(dim1=-2, dim2=-1).fill_(1)
-        Lt_inv = torch.triangular_solve(I, L, upper=False).solution.transpose(-1, -2)
-        Ut_inv = torch.triangular_solve(I, U, upper=True).solution.transpose(-1, -2)
-
-        phi_L = (L.transpose(-1, -2) @ LU_grad).tril_()
-        phi_U = (LU_grad @ U.transpose(-1, -2)).triu_()
-
-        self_grad_perturbed = Lt_inv @ (phi_L + phi_U) @ Ut_inv
-        return P @ self_grad_perturbed, None, None
+        phi_L = (L.transpose(-1, -2).conj() @ LU_grad).tril_()
+        phi_U = (LU_grad @ U.transpose(-1, -2).conj()).triu_()
+        phi = phi_L + phi_U
+        X = torch.triangular_solve(phi, L.transpose(-1, -2).conj(), upper=True).solution
+        A_grad = torch.triangular_solve(X.transpose(-1, -2).conj() @ P.transpose(-1, -2), U, upper=True) \
+            .solution.transpose(-1, -2).conj()
+
+        return A_grad, None, None
def sample_inputs_lu(op_info, device, dtype, requires_grad=False):
+    # not needed once OpInfo tests support Iterables
+    def generate_samples():
+        batch_shapes = ((), (3,), (3, 3))
+        for batch_shape, get_infos in product(batch_shapes, (True, False)):
+            shape = batch_shape + (S, S)
+            input = make_tensor(shape, device, dtype, requires_grad=requires_grad, low=None, high=None)
+            yield SampleInput(input, args=(True, get_infos))
+
+    return list(generate_samples())
matrix inverse, autograd
3 PyTorch 8e507ad00ebdfd0ae84bc03718e9c2cb74b8573b yaml Fix overflow/underflow/loss of precision Inaccurate result overflow, underflow, inexact This script defines derivative formulas and Python signatures of methods on Variables The division formula in the backward pass is unstable because squaring the divisor can lose precision: when the divisor being squared is very large or very small, precision is lost. For extremely large values the squared divisor may overflow and evaluate to inf; for extremely small values it will underflow and evaluate to 0. gradients/derivatives automatic differentiation backward pass, autograd, division, derivative, higher order gradients rewrite math formula rewrite math formula Instead of dividing by other squared, divide by other twice. Mathematically x / y^2 = x / y / y, but if y is a large finite-precision floating-point number, computing y^2 may lose precision. Successive divisions achieve the same result while losing less precision for large values of y. other: -grad * self / (other * other) other: -grad * (self / other) / other division
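A minimal Python sketch (plain IEEE-754 doubles, illustrative values only) of why dividing by the squared value is riskier than dividing twice:

x, y = 1e300, 1e200
print(x / (y * y))   # 0.0: y * y overflows to inf, so the quotient collapses to 0
print((x / y) / y)   # 1e-100: dividing twice never forms the huge intermediate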
4 PyTorch fe5d23cf4a9d8f673fb1bfc6e84c642fb6a23182 C++ Fix loss of precision incorrect result and NaN Inexact Cosine Similarity measures the cosine of the angle between two non-zero vectors of an inner product space. This similarity measurement is particularly concerned with orientation, rather than magnitude. In short, two vectors that are aligned in the same orientation will have a similarity measurement of 1, whereas two vectors aligned perpendicularly will have a similarity of 0. If two vectors are diametrically opposed, meaning they are oriented in exactly opposite directions (i.e. back-to-back), then the similarity measurement is -1. Often, however, Cosine Similarity is used in positive space, between the bounds 0 and 1. Cosine Similarity is not concerned with, and does not measure, differences in magnitude (length); it is only a representation of similarity in orientation. The cosine similarity implementation may lose precision and return a value greater than 1.0, which is incorrect, because cosine similarity outputs lie in the range -1 to 1. linear algebra distance cosine similarity distance rewrite math formula rewrite math formula Use x / sqrt(x * x) instead of x / (sqrt(x) * sqrt(x)), following the scipy implementation -  Tensor n12 = (w1 * w2).rsqrt_().clamp_max(1.0 / eps);
-  return w12.mul_(n12);
  Tensor n12 = (w1 * w2).clamp_min_(eps * eps).sqrt_();
+  return w12.div_(n12);
  # Check dividing by 0.
+        input1 = torch.randn(10).requires_grad_()
+        input2 = torch.zeros_like(input1).requires_grad_()
+        torch.cosine_similarity(input1, input2, 0).sum().backward()
+        self.assertEqual(input1.grad, torch.zeros_like(input1))
+        self.assertEqual(input2.grad, input1 * 1e8)
reciprocal of square root
5 Tensorflow/Keras 646d25d15910dc5cc3532aebb7e8395487adad4f C++ Fix overflow/underflow softmax output is NaN overflow, underflow Softmax is a normalized exponential function that takes a vector of n real values as input and outputs a vector of n real values that represent a probability distribution and sum up to 1. In deep learning classifiers, softmax is used in the last layer, because it normalizes the output of the prior network layer, a vector with size n, to a probability distribution over n predicted output classes. Direct calculation of the softmax function according to its definition formula is fraught with numerical issues. The single-precision exp(x) function overflows for x > 89 and underflows for x < −104, which in turn causes NaN outputs in naïve implementations. activation functions activation functions softmax, metal GPU acceleration use a different algorithm use a different algorithm Implement a three-pass softmax algorithm; see the algorithm in https://arxiv.org/pdf/2001.04438.pdf softmax https://arxiv.org/pdf/2001.04438.pdf
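For reference, a hedged numpy sketch of the textbook max-subtraction ("safe") softmax -- not necessarily the exact algorithm of the referenced paper or commit, but it shows how the overflow is avoided:

import numpy as np

def softmax_naive(x):
    e = np.exp(x)                # overflows to inf for large x
    return e / e.sum()

def softmax_safe(x):
    e = np.exp(x - x.max())      # arguments to exp are <= 0, so no overflow
    return e / e.sum()

x = np.array([1000.0, 1001.0], dtype=np.float32)
print(softmax_naive(x))          # [nan nan] (with overflow warnings)
print(softmax_safe(x))           # approx. [0.269 0.731]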
6 Tensorflow/Keras a3d726ae8246371515a0f666c38668e9da7765f9 C++ Fix underflow error due to divide by zero invalid operation, underflow In centered RMSProp, the gradient is normalized by an estimate of its variance. The denominator in the centered RMSProp optimizer does not add a small epsilon as the last operation, so the epsilon is not effective at preventing underflow. Given the current formula ms + eps - mg.square, if ms and mg.square are of very similar magnitude, subtracting the two similar numbers leads to loss of significant digits and a risk of underflow; because the epsilon was added to ms before the subtraction, it does not prevent this. optimizers optimizers centered RMSprop optimizer rewrite math formula rewrite math formula Rewrite the order of operations. Reordered the sum (ms - mg^2 + epsilon) to add epsilon last for numerical stability both on CPU and GPU. auto denom = ms + epsilon() - mg.square();

auto denom = epsilon.reshape(single).broadcast(bcast) + ms - mg.square().sqrt()
auto denom = (ms - mg.square()) + epsilon()

auto denom = (ms - mg.square()) + epsilon.reshape(single).broadcast(bcast)
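A minimal float32 sketch (illustrative values only) of the cancellation the patch addresses: adding epsilon before the subtraction lets it be absorbed into ms, while adding it last keeps it:

import numpy as np

ms  = np.float32(1.0)          # second-moment estimate
mg2 = np.float32(1.0)          # squared mean-gradient estimate, nearly equal to ms
eps = np.float32(1e-10)

old = ms + eps - mg2           # 1.0 + 1e-10 rounds back to 1.0 in float32 -> 0.0
new = (ms - mg2) + eps         # epsilon is added last and survives -> 1e-10
print(old, new)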
7 PyTorch 6a458512c22c908b19f49262fd0f32a14425ec80 C++ Fix loss of precision assertion error Inexact static_cast converts the type of a variable
static_cast can perform conversions between pointers to related classes, not only upcasts (from pointer-to-derived to pointer-to-base), but also downcasts (from pointer-to-base to pointer-to-derived). No checks are performed during runtime to guarantee that the object being converted is in fact a full object of the destination type.
The test test_computes_cubic_kernel fails with an assertion error because the value is expected to be less than 1e-5 but is slightly larger: 1.0790e-05. The cause is the precision of a variable returned by a function that performs the power operation (x to the power of y).

On x86_64 a long double will utilize the x87 unit's special and proprietary 80-bit float (the 8087 was the floating point co-processor of the 8086; today it sits on the same die as a modern amd64 processor). This 80-bit floating point type is not one of the standard IEEE 754 interchange formats. Even though it has more bits of precision, its lack of standardization and its niche nature mean that it is often the cause of stability issues, and it is not worth using.
tensor math tensor math power, low level math increase variable precision/change variable type increase variable precision/change variable type Stop using long doubles; they will only cause trouble. Instead, just use the same type as the function input. power https://en.wikipedia.org/wiki/X87#Performance
8 Tensorflow/Keras d4b5c606fc9fbd1a20b5b113b4bc831f31d889a3 Python fix loss of precision Dividing by a number that is squared results in dividing by a very large or small number. The square operation could overflow or underflow respectively, and even if it does not, there is a risk of loss of precision from dividing values of very different magnitudes gradients/derivatives gradients gradient rewrite math formula rewrite math formula Avoid a squared value in the denominator and rewrite the division as (-x/y)/y instead of -x/y^2. They are mathematically equivalent, but the first formula avoids dividing by very large or very small numbers. Proof that they are mathematically equivalent: (-x/y)/y = (-x/y)*(1/y) = -x/(y^2) math_ops.reduce_sum(grad * math_ops.div(-x, math_ops.square(y))) math_ops.reduce_sum(grad * math_ops.div(math_ops.div(-x, y), y)) division
9 Tensorflow/Keras 2411514c726f4ccd98e864e8b2e253e6df99c39d C++ fix loss of precision The formula for dequantization in the quantization range for multiplication is numerically unstable quantization quantization dequantization rewrite math formula rewrite math formula Rewrite the order of operations. Specifically, rewrite q_range_min + (input_array - q_lowest) * q_range_scale to the following: (q_range_min - q_lowest * q_range_scale) + input_array * q_range_scale, which is mathematically equivalent #define DEQUANTIZE_WITH_EIGEN(input_array, q2f)                       \
-  (q2f.range_min +                                                    \
-   (((input_array.template cast<float>() - q2f.lowest_quantized())) * \
-    q2f.range_scale));
#define DEQUANTIZE_WITH_EIGEN(input_array, q2f)                 \
+  ((q2f.range_min - q2f.lowest_quantized() * q2f.range_scale) + \
+   input_array.template cast<float>() * q2f.range_scale)
// Test for signed 32 bit.
+  // Note that we cannot use input mins and maxes that match the range because
+  // there are 7 too few bits of mantissa accuracy in floats to represent
+  // 2**31-1 accurately.  Also there is no good fraction to use because 2**31-1
+  // is a mersenne prime.
+  Tensor input32(DT_QINT32, TensorShape({input_height, input_width}));
+
+  // Use a quantizer centered at 0.
+  float input_range = 1LL << 25;
+  int64 num_levels = (1LL << 32) - 1;
+  float step_size =
+      static_cast<float>(static_cast<double>(input_range) / num_levels);
+  float q_compatible_min_value =
+      roundf(-(input_range / 2.0) / step_size) * step_size;
+  float q_compatible_max_value = q_compatible_min_value + input_range;
+  test::FillValues<qint32>(&input32, {-16384, 0, 16256, -13440, -13312, -13184,
+                                      14720, 14848, 14976});
+
+  Tensor output32 = QuantizedTensorToFloat<qint32>(
+      input32, q_compatible_min_value, q_compatible_max_value);
+  test::FillValues<float>(&expected, {-128.0f, 0.0f, 127.0f, -105.0f, -104.0f,
+                                      -103.0f, 115.0f, 116.0f, 117.0f});
+  // The quantization error in going between 1<<25 and 1<<32 levels.
+  const double kTolerance = .5 / 128.0;
+  test::ExpectTensorNear<float>(expected, output32, kTolerance);
10 PyTorch 43ab91118226b330be6d2274a154b98da233d879 C Fix loss of precision Inaccurate result Inexact Dirichlet distribution is a family of continuous multivariate probability distributions parameterized by a vector Alpha of positive reals. It is a multivariate generalization of the beta distribution, hence its alternative name of multivariate beta distribution (MBD). Dirichlet distributions are commonly used as prior distributions in Bayesian statistics, and in fact the Dirichlet distribution is the conjugate prior of the categorical distribution and multinomial distribution.
In Bayesian probability theory, if the posterior distribution p(θ|x) and the prior distribution p(θ) are from the same probability distribution family, then the prior and posterior are called conjugate distributions, and the prior is the conjugate prior for the likelihood function.

The saddle point technique is a method for deriving an accurate approximation for the probability density function of the mean of a random sample. A point that is not a local extremum yet has zero gradient is called a saddle point; such points can occur in non-convex functions.
low precision of gradient approximation in Dirichlet distribution statistical distributions statistical distributions distributions, Dirichlet distribution, gradient approximation use a different algorithm use a different algorithm Use Taylor expansion and Rice saddle point expansion to approximate gradient and use higher precision types for that computation https://en.wikipedia.org/wiki/Dirichlet_distribution
11 PyTorch ae1a972d78950abc4dab372f496914b5e78b9637 C++ Fix loss of precision inaccurate result Log softmax is an activation function used in the last layer of a neural network that outputs log probabilities loss of precision in log_softmax cpu code when inputs are big but their differences are small activation functions activation functions log softmax rewrite math formula rewrite math formula Rewrite the order of operations to avoid losing significant digits when a small number is added to a much larger one. Change the order of operations so that the large number is first subtracted from the other large number before the small number is added. tmpsum = max_input + std::log(tmpsum);
output_data[d * dim_stride] = input_data[d * dim_stride] - tmpsum;
tmpsum = std::log(tmpsum);
output_data[d * dim_stride] =
+                  input_data[d * dim_stride] - max_input - tmpsum;
log(exp(x_i)/sum(exp(x))) log, exp, division, sum
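A hedged numpy sketch of the reordering above (float32, illustrative inputs): subtracting the large max before adding the small log-sum keeps the small term from being absorbed:

import numpy as np

def log_softmax_old(x):
    tmpsum = x.max() + np.log(np.exp(x - x.max()).sum())   # large + small: the small term is lost
    return x - tmpsum

def log_softmax_new(x):
    tmpsum = np.log(np.exp(x - x.max()).sum())
    return x - x.max() - tmpsum                             # large - large first, then - small

x = np.array([1e8, 1e8], dtype=np.float32)
print(log_softmax_old(x))   # [0. 0.]            -- wrong
print(log_softmax_new(x))   # [-0.6931 -0.6931]  -- correct, log(1/2)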
12 PyTorch 0c588a500b2219c028eefe595cff0829fd982f52 Python Fix loss of precision SigmoidCrossEntropyWithLogits computes sigmoid cross entropy given logits. Sigmoid cross-entropy is a Sigmoid activation plus a Cross-Entropy loss. Using sigmoid followed by a multinomial logistic loss layer can be less stable than a single layer of sigmoid cross entropy with logits loss functions loss functions, activation functions cross entropy, sigmoid use a different algorithm use a different algorithm Use a single layer of sigmoid cross entropy with logits instead. Replace sigmoid + xent loss with SigmoidCrossEntropyWithLogits. The sigmoid layer computes the multinomial logistic loss of the sigmoid of its inputs. It's conceptually identical to a sigmoid layer followed by a multinomial logistic loss layer, but provides a more numerically stable gradient.
13 PyTorch 3d06a1e075ef0e6f4bf862d13e83cdd4b02dbc32 Cuda Fix loss of precision Welford’s method is a usable single-pass method for computing the variance. It can be derived by looking at the differences between the sums of squared differences for N and N-1 samples. Algorithm:
variance(samples):
  M := 0
  S := 0
  for k from 1 to N:
    x := samples[k]
    oldM := M
    M := M + (x-M)/k
    S := S + (x-M)*(x-oldM)
  return S/(N-1)
THCTensor_varInnermostDim numerically unstable tensor math tensor math low level tensor math, variance calculation, GPU use a different algorithm use a different algorithm Make THCTensor_varInnermostDim numerically stable using Welford's algorithm (#3425)
    * Use Welford's algorithm when reducing along inner dimension for THCTensor's variance fn
    * Use accreals in THCTensor's varInnermostDim
variance https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Welford's_online_algorithm
14 PyTorch 638f0b5d78fe5ff2e484dc573c35b97a4bcf4e82 Python Fix invalid input loss = NaN invalid operation Negative log likelihood loss with Poisson distribution of target. The Poisson distribution is used to model the number of events occurring within a given time interval.
target ∼ Poisson(input); loss(input, target) = input − target * log(input) + log(target!)
log(0) produces NaN in the poisson negative log likelihood loss function loss functions loss functions loss, poisson negative log likelihood loss rewrite math formula rewrite math formula Add a small epsilon to prevent log(0), following the Keras implementation; eps defaults to 1e-8 and is used when log_input=False. Old: `input - target * log(input)`.
New: `input - target * log(input + eps)`.
log https://pytorch.org/docs/stable/generated/torch.nn.PoissonNLLLoss.html
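A minimal numpy sketch (illustrative only) of the log(0) case the epsilon guards against when log_input=False:

import numpy as np

inp    = np.array([0.0])
target = np.array([0.0])
eps    = 1e-8

old = inp - target * np.log(inp)         # log(0) = -inf and 0 * -inf = nan (numpy warns)
new = inp - target * np.log(inp + eps)   # finite
print(old, new)                          # [nan] [0.]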
15 PyTorch 81b995514ea908b635d725e11d1b91ac7ad03eb0 C Fix overflow/loss of precision Welford’s method is a usable single-pass method for computing the variance. It can be derived by looking at the differences between the sums of squared differences for N and N-1 samples. Algorithm:
variance(samples):
  M := 0
  S := 0
  for k from 1 to N:
    x := samples[k]
    oldM := M
    M := M + (x-M)/k
    S := S + (x-M)*(x-oldM)
  return S/(N-1)
numerical stability of std and var of THTensor,  formulas for the variance may involve sums of squares, which causes loss of precision or overflow when dealing with large values tensor math tensor math low level tensor math, variance and standard deviation calculation, CPU use a different algorithm use a different algorithm Use Welford’s algorithm for better numerical stability tensor = torch.FloatTensor([1.0, 2.0, 3.0])
        self.assertEqual(tensor.var(unbiased=True), 1.0)
        self.assertEqual(tensor.var(unbiased=False), 2.0 / 3.0)
variance, standard deviation
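A runnable Python version of the Welford recurrence quoted in the two entries above (a sketch, not the THTensor/THCTensor code):

def welford_variance(samples):
    mean, m2 = 0.0, 0.0
    for k, x in enumerate(samples, start=1):
        old_mean = mean
        mean += (x - mean) / k               # running mean
        m2 += (x - mean) * (x - old_mean)    # running sum of squared deviations
    return m2 / (len(samples) - 1)           # unbiased variance

print(welford_variance([1.0, 2.0, 3.0]))     # 1.0, matching tensor.var(unbiased=True) above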
16 PyTorch 455038e470dd60dae45f68948ae876b1931a8bf0 Cuda Fix overflow/underflow Spatial logsoftmax computes the log of spatial softmax. Spatial softmax returns the expected pixel locations of each feature map in a CNN and hence would be better described as spatial soft-argmax. It is defined in https://arxiv.org/pdf/1504.00702.pdf.

Each output channel of the softmax is a probability distribution over the location of a
feature in the image. To convert from this distribution to a coordinate representation
(fcx, fcy), the network calculates the expected image position of each feature, yielding a
2D coordinate for each channel.

s_cij = e^(a_cij) / sum_{i',j'} e^(a_ci'j'), where i and j are coordinates specifying a location in an image
The spatial log softmax in the CUDA backend for the Neural Network Package is not stable activation functions activation functions spatial log softmax, CNN rewrite math formula rewrite math formula The patch subtracts the maximum input value inside exp() while accumulating the sum; this ensures that the argument of exp() is never too large. sum += THCNumerics<T>::exp(input[inputStartIndex + i]);
sum = AccumT(1) / sum;
output[outputIndex] = ScalarConvert<AccumT, T>::to(
-        THCNumerics<AccumT>::log(sum * THCNumerics<T>::exp(input[inputStartIndex + i])));
T maxInput = input[inputStartIndex];
+    for (int i = 1; i < classSize; i++) {
+      T value = input[inputStartIndex + i];
+      maxInput = THCNumerics<T>::ge(maxInput, value) ? maxInput : value;
+    }
+      sum += THCNumerics<T>::exp(input[inputStartIndex + i] - maxInput);
+    T logsum = maxInput + ScalarConvert<AccumT, T>::to(THCNumerics<AccumT>::log(sum));
+      output[outputIndex] = input[inputStartIndex + i] - logsum;

spatial logsoftmax, log, exp, scalar convert https://arxiv.org/pdf/1504.00702.pdf
17 PyTorch c010ef7f0c6d837809a7e973048afac76373e3de Cuda Fix overflow A CUDA block is a group of threads that execute the same task. CUDA blocks are grouped into a grid. A kernel (i.e., a CUDA function) is executed as a grid of blocks of threads. Overflow issue in the GET_BLOCKS CUDA function that returns the number of blocks used for scheduling blocks on the CUDA device (i.e., an Nvidia GPU), because the addition on N can overflow for large N. other Cuda blocks Cuda thread scheduling rewrite math formula rewrite math formula Rather than directly adding to N, rearrange the operations to shrink N first. (N + CUDA_NUM_THREADS - 1) / CUDA_NUM_THREADS; auto block_num = (N - 1) / CUDA_NUM_THREADS + 1; division
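A Python sketch that simulates 32-bit signed arithmetic (C signed overflow is actually undefined behavior, so this is only an illustration) to show why adding to N before dividing can overflow while shrinking N first cannot:

CUDA_NUM_THREADS = 512

def wrap_int32(v):
    # emulate two's-complement wrap-around of a 32-bit signed int
    return (v + 2**31) % 2**32 - 2**31

N = 2**31 - 100                                                  # large but representable element count
old = wrap_int32(N + CUDA_NUM_THREADS - 1) // CUDA_NUM_THREADS   # negative: the addition wrapped
new = (N - 1) // CUDA_NUM_THREADS + 1                            # 4194304 blocks, no overflow
print(old, new)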
18 PyTorch 6be3e5d3bb00a288da51bd368c5342c8676bbcf7 Python Unit test loss of precision One of the basic units of computation in Caffe2 is the Operator. Operators in Caffe2 are similar to functions. Caffe (Convolutional Architecture for Fast Feature Embedding) is a deep learning framework originally developed at the University of California, Berkeley. It is open source, under a BSD license, and is written in C++ with a Python interface. Unstable formula for updating the gradient and momentum in the Adagrad optimizer in the Caffe2 operators test script optimizers optimizers adagrad testing, optimizer, gradients, caffe2, weight decay, momentum rewrite math formula rewrite math formula Rewrite the formula; specifically, update the gradient through a temporary variable (change x += y to temp = x + y) grad += weight_decay * param_in_f32 grad_temp = grad + weight_decay * param_in_f32
19 PyTorch 0b7e8323256e56728e1ffc9ee5d701987af3d06c C++ Unit test overflow The primary difference between const and constexpr variables is that the initialization of a const variable can be deferred until run time. A constexpr variable must be initialized at compile time. signed integer overflow of the variable range other random number generator random number generator testing increase variable precision/change variable type increase variable precision/change variable type Change the type of the variable range from a signed to an unsigned 64-bit int,
and change the type from const auto to constexpr
const int64_t max_val = std::is_floating_point<T>::value ? int64_max_val : static_cast<int64_t>(t_max_val);
range = *to - from;
range = max_val - from + 1;
range = static_cast<uint64_t>(*to) - static_cast<uint64_t>(from);
range = static_cast<uint64_t>(max_val) - static_cast<uint64_t>(from) + 1;
20 PyTorch 470c496eb224bdd735eea1accf7269dfdd87d49f Python Fix loss of precision Cholesky inverse = Compute inverse of Hermitian positive definite matrix using Cholesky factorization
inverse(S) = inverse(LL*)
In multivariate normal distribution class, there is a function for computing the precision matrix that uses inverse, which is numerically unstable statistical distributions statistical distributions multivariate normal distribution, precision matrix use a different algorithm use a different algorithm Replace the naive inverse with a cholesky inverse for improved stability -        scale_tril_inv = torch.inverse(self._unbroadcasted_scale_tril)
-        return torch.matmul(scale_tril_inv.transpose(-1, -2), scale_tril_inv).expand(
identity = torch.eye(self.loc.size(-1), device=self.loc.device, dtype=self.loc.dtype)
+        # TODO: use cholesky_inverse when its batching is supported
+        return torch.cholesky_solve(identity, self._unbroadcasted_scale_tril).expand(
matrix inverse
21 PyTorch 071971476d7431a24e527bdc181981678055a95d Python Fix overflow torch.clamp(input, min, max, *, out=None) → Tensor
Clamp all elements in input into the range [ min, max ].
The Binomial distribution class encounters overflow when logits are large. Note: the binomial distribution is parametrized by logits statistical distributions statistical distributions Binomial distribution, log probability rewrite math formula rewrite math formula Rewrite the equation for the log_prob method and use a custom clamp function on the logits. The custom clamp function works like torch.clamp(x, min=0), except that its gradient at x = 0 is 0.5 instead of 0.
-        return (log_factorial_n - log_factorial_k - log_factorial_nmk +
-                value * self.logits - self.total_count * torch.log1p(self.logits.exp()))
def _clamp_by_zero(x):
+    # works like clamp(x, min=0) but has grad at 0 is 0.5
+    return (x.clamp(min=0) + x - x.clamp(max=0)) / 2

+        normalize_term = (self.total_count * _clamp_by_zero(self.logits)
+                          + self.total_count * torch.log1p(torch.exp(-torch.abs(self.logits)))
+                          - log_factorial_n)
+        return value * self.logits - log_factorial_k - log_factorial_nmk - normalize_term

def test_binomial_stable(self):
+        logits = torch.tensor([-100., 100.], dtype=torch.float)
+        total_count = 1.
+        x = torch.tensor([0., 0.], dtype=torch.float)
+        log_prob = Binomial(total_count, logits=logits).log_prob(x)
+        self.assertTrue(torch.isfinite(log_prob).all())
+
+        # make sure that the grad at logits=0, value=0 is 0.5
+        x = torch.tensor(0., requires_grad=True)
+        y = Binomial(total_count, logits=x).log_prob(torch.tensor(0.))
+        self.assertEqual(grad(y, x)[0], torch.tensor(-0.5))
22 PyTorch 3dcc329746223bc24f8213ccbaa5eba09273e162 C++ Fix loss of precision Inaccurate result Inexact Summation of numbers should be performed from smallest to largest to avoid loss of significant digits Loss of precision and floating point truncation in the summation formula. Summing many floating point values can lead to loss of precision if the values are of different orders of magnitude. tensor math tensor math summation, tensor math use a different algorithm use a different algorithm Use a tree-based approach where items of similar orders of magnitude are summed together to avoid numerical instability.

This algorithm does the summation along a single axis with multiple "levels" of accumulator, each of which is designed to hold the sum of an order of magnitude more values than the previous. e.g. if there are 2^16 elements, the first level will hold the sum of 2^4 elements, and so on in increasing powers of 2: 2^4, 2^8, 2^12 and finally 2^16. This limits the differences in magnitude of the partial results being added together, and so we don't lose accuracy as the axis length increases.
A simplified recursive implementation would look like this:
+
+  scalar_t row_sum(const scalar_t * data, int64_t n) {
+    // Note, in practice the chunk size can increase with n
+    // This allows the recursion depth to be limited to O(1).
+    constexpr int64_t min_chunk_size = 16;
+
+    scalar_t sum = 0;
+    if (n <= min_chunk_size) {
+      // Recursive base case, calculate a simple running sum
+      for (int64_t i = 0; i < n; ++i) {
+        sum += data[i];
+      }
+      return sum;
+    }
+
+    // Recursively sum larger chunks of elements
+    const int64_t chunk_size = std::max(divup(n, min_chunk_size), min_chunk_size);
+    for (int64_t i = 0; i < n; i += chunk_size) {
+      sum += row_sum(data + i, std::min(chunk_size, n - i));
+    }
+    return sum;
+  }
ASSERT_NEAR(norm_after, max_norm, 1e-6); sum
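A hedged float32 illustration of the underlying effect (using numpy's pairwise summation as a stand-in for the multi-level accumulators; this is not the PyTorch code): a single running float32 accumulator drifts badly once the partial sum dwarfs the summands:

import numpy as np

a = np.full(10**6, 0.1, dtype=np.float32)

naive = np.float32(0.0)
for v in a:                 # one running accumulator, the effect the patch avoids
    naive += v

print(naive)                # roughly 100958: far from the true 100000
print(a.sum())              # numpy sums pairwise, so the result stays close to 100000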
23 PyTorch d16c8238e164c6499714de625eb73422382e5ec1 Python Fix overflow/underflow Inaccurate result, NaN overflow, underflow, inexact The softmax function turns a vector of K real values into a vector of K real values that sum to 1. The input values can be positive, negative, zero, or greater than one, but softmax transforms them into values between 0 and 1, so that they can be interpreted as probabilities. The implementation of softmax for certain cases (when the dim argument of softmax does not equal ndim - 1, i.e. the last dimension) is numerically unstable. Large inputs into the exponential function produce infinity and the output of softmax becomes NaN. activation functions activation functions softmax use a different algorithm use a different algorithm Transpose the input to allow using ONNX's numerically stable softmax implementation softmax exp
24 PyTorch b403b10ff98a6bc1a238e7ba4eee6393b6b89048 C++ Fix loss of precision categorical cross entropy yields inaccurate results Inexact When a small float is subtracted from a large float, the large float does not change in value (as it should mathematically). logsoftmax does not work for large logits; as a result nn.CrossEntropyLoss() yields incorrect results for big logits loss functions loss functions logsoftmax, cross entropy loss rewrite math formula rewrite math formula Rewrite the formula considering the maximum input. If we add a very small number to a large one, the small number will be ignored. Example: tmpsum = 1e8 + log(2) = 1e8. Numerically with float precision the log(2) is ignored, so at the end we basically have 1e8 - (1e8 + log(2)) = 0 instead of -log(2). [tmp_sum](Vec x) { return x - Vec(tmp_sum); } [tmp_sum](Vec x) { return x - Vec(tmp_sum); } def test_log_softmax(self):
+        x_small = torch.ones(1, 2, dtype=torch.float32)
+        x_big = x_small + 1e16
+        self.assertEqual(F.log_softmax(x_small, -1), F.log_softmax(x_big, -1))
log softmax subtraction
25 PyTorch f8cab38578a99ad04d23256c2da877db4814f76f Python Fix invalid operation ? Only a positive definite matrix has a unique Cholesky factorization A = R^T R, where R is upper triangular with positive diagonal elements. A positive definite matrix is a symmetric matrix with all positive eigenvalues. Cholesky decomposition is roughly twice as efficient as LU decomposition for solving systems of linear equations. In A = R^T R, R is called the Cholesky factor of A. The matrix inverse triggers a Cholesky error because the matrix is not positive definite. Also, the matrix inverse can cause numerical instability. statistical distributions statistical distributions Gaussian distribution rewrite math formula rewrite math formula only take the inverse of a triangular matrix def _precision_to_scale_tril(P):
+    # Ref: https://nbviewer.jupyter.org/gist/fehiepsi/5ef8e09e61604f10607380467eb82006#Precision-to-scale_tril
+    Lf = torch.cholesky(torch.flip(P, (-2, -1)))
+    L_inv = torch.transpose(torch.flip(Lf, (-2, -1)), -2, -1)
+    L = torch.triangular_solve(torch.eye(P.shape[-1], dtype=P.dtype, device=P.device),
+                               L_inv, upper=False)[0]
+    return L
matrix inverse
26 PyTorch c1790fa202f30e3aca1d1ecb31f26e0b3bb1e69f Cuda, C++ Fix loss of precision Linear interpolation is a method of curve fitting using linear polynomials to construct new data points within the range of a discrete set of known data points. Parameters of lerp: a, b, t; output: a + t * (b - a). The parameter t defines where to estimate the value on the interpolated line; it is 0 at the first point and 1 at the second point, and ranges between 0 and 1 for interpolated values between the two points. https://en.wikipedia.org/wiki/Linear_interpolation#Programming_language_support unstable formula for linear interpolation tensor math linear interpolation linear interpolation rewrite math formula rewrite math formula rewrite formula
// Imprecise method, which does not guarantee v = v1 when t = 1, due to floating-point arithmetic error. This method is monotonic
// This form may be used when the hardware has a native fused multiply-add instruction.
float lerp(float v0, float v1, float t) {
  return v0 + t * (v1 - v0);
}

// Precise method, which guarantees v = v1 when t = 1. This method is monotonic only when v0 * v1 < 0. Lerping between same values might not produce the same value
float lerp(float v0, float v1, float t) {
  return (1 - t) * v0 + t * v1;
}
ret_val = self_val + weight_val * (end_val - self_val);
ret_val = (weight_val < 0.5) ?
            self_val + weight_val * (end_val - self_val) : end_val - (end_val - self_val) * (1 - weight_val);
a + t * (b - a) linear interpolation
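A plain-Python (double precision) illustration of the endpoint issue quoted above: the one-multiply form can miss v1 at t = 1, while the two-multiply form reproduces it exactly:

def lerp_imprecise(v0, v1, t):
    return v0 + t * (v1 - v0)

def lerp_precise(v0, v1, t):
    return (1 - t) * v0 + t * v1

v0, v1 = 1e20, 1.0
print(lerp_imprecise(v0, v1, 1.0))  # 0.0: v1 is rounded away when forming v1 - v0
print(lerp_precise(v0, v1, 1.0))    # 1.0: returns v1 exactly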
27 PyTorch e17b8dea1dd30bef55b314b0217f79ce22a13cf9 C++ Fix overflow In C and C++, integer literals are interpreted as an `int` type unless specified otherwise by using a trailing L for long and LL for long long, e.g., 42 is an int, 42L is a long, 42LL is a long long. On x86_64 systems using GNU toolchains on Linux, this is 32, 64, and 64 bits respectively.
$ cat long.c
#include <stdio.h>

int main(void) {
    printf("sizeof(42) = %lu\n", sizeof(42));
    printf("sizeof(42L) = %lu\n", sizeof(42L));
    printf("sizeof(42LL) = %lu\n", sizeof(42LL));
    return 0;
}
(py38) kyle@fulltower:~
$ gcc long.c
(py38) kyle@fulltower:~
$ ./a.out
sizeof(42) = 4
sizeof(42L) = 8
sizeof(42LL) = 8
The accumulator overflows because the starting value of the accumulation has too small a type (int) to accommodate input sizes that are common in PyTorch. The calculation of the number of elements (e.g., number of batches) overflows because the return type does not have enough precision to hold the result. linear algebra linear algebra linear algebra, distance increase variable precision/change variable type change variable type Use a 64 bit type for the accumulator.
-  int64_t numel = std::accumulate(oldshape.begin(), oldshape.end(), 1,
-                                  std::multiplies<int64_t>());
const int64_t numel = prod_intlist(oldshape);
28 PyTorch 56840f0a81e4460089740d50d3768f37e79a17fc Cuda Fix overflow In binary search, the variables used to represent the indices will often be of fixed size (integers), and this can result in an arithmetic overflow for very large arrays.
If the midpoint of the span is calculated as (L+R)/2, then the value of L+R may exceed the range of integers of the data type used to store the midpoint, even if L and R are within the range.
If L and R are nonnegative, this can be avoided by calculating the midpoint as  L+ ((R-L)/2)

Bucketize bucketizes 'input' based on 'boundaries'.

Summary
For example, if the inputs are boundaries = [0, 10, 100] input = [[-5, 10000] [150, 10] [5, 100]]

then the output will be output = [[0, 3] [3, 2] [1, 3]]
Possible overflow when adding two 32-bit ints in the binary search algorithm when calculating the midpoint other bucketize binary search, bucketize operation rewrite math formula rewrite math formula Subtracting low from high first ensures that the intermediate calculation will not overflow its 32-bit datatype. int32_t median = (high + low) / 2; const int32_t median = low + (high - low) / 2;
    #include <cstdint>
    #include <iostream>
    int32_t mp1(int32_t a, int32_t b){
            return (a+b)/2;
    }
    int32_t mp2(int32_t a, int32_t b){
            return a+(b-a)/2;
    }
    int main(){
            int32_t low=-1;
            for(int32_t high=1;high<10000;high++){
                    if(mp1(low,high)!=mp2(low,high)){
                            std::cout<<"Ahhhh!"<<std::endl;
                    }
            }
    }
29 PyTorch 7f42d1c98a72855806bd35ef27ce6823837e0816 C++ Fix loss of precision Python "floats" are actually doubles internally. Originally a float was used, which has less precision than a double. The JIT only supports double, not float, so in insertConstant we need to cast the Python `float_` to double instead of float. This fixes the incorrect `math.pi` and other high-precision constant values. other other python bindings from C++, low level math, constants increase variable precision/change variable type increase variable precision When converting a PyObject representation into a C++ representation, use a double instead of a float return toSimple(g.insertConstant(py::cast<float>(obj), loc)); return toSimple(g.insertConstant(py::cast<double>(obj), loc));
30 PyTorch c784f847debc6f6a30b41da6853517b2ccd3ddf0 C++ Fix overflow int is 32 bits on amd64/Linux/GNU. Sizes and indexes should use size_t in order to use the word size of the current platform, which allows one to index as many elements as could possibly fit into memory. sparse_adagrad param_size overflow error optimizers optimizers adagrad optimizer increase variable precision/change variable type increase variable precision Correctly replace the data type of a size from "int" to "size_t" int param_size
uint64_t idx_pref = indices[i_pref];
size_t param_size
auto idx_pref = indices[i_pref];
31 PyTorch 76c1b5cd794c44e4fec8da1d87ec8f0ccc045e68 C++ Fix overflow std::numeric_limits is a way to query various properties of arithmetic types. The bug comes from reusing a variable whose data type (precision) depends on the template argument.
Bug: caffe2/caffe2/operators/stats_put_ops.h:66:25: runtime error: 9.22337e+18 is outside the range of representable values of type 'long'. The assignment from int64_t to float loses some precision, and because of that we overflow
other external library Caffe operators increase variable precision/change variable type increase variable precision Increase the precision of the computation to int64, as opposed to converting from float to int64_t at the end
add overflow safeguard using std::numeric_limits
-        input = 0;
-      } else if (input < -bound_value) {
-        input = -bound_value;
-      } else if (input > bound_value) {
-        input = bound_value;


-    int64_t int_value = input * magnitude_expand_;
int_value = 0;
+      } else if (input <= -bound_value) {
+        int_value = std::numeric_limits<int64_t>::min();
+      } else if (input >= bound_value) {
+        int_value = std::numeric_limits<int64_t>::max();
+      } else {
+        int_value = input * magnitude_expand_;
       }
     } else {
       CAFFE_ENFORCE(
           std::abs(static_cast<int64_t>(input)) < bound_value,
           "Input value is too large for the given magnitude expansion!");
       CAFFE_ENFORCE(!isNan(input), "Input value cannot be NaN!");
+      int_value = input * magnitude_expand_;
     }
def test_clamp_with_out_of_bounds(self):
+        put_value = float(1e20)
+        magnitude_expand = 1000000000000
+        stat_name = "stat".encode('ascii')
+        sum_postfix = "/stat_value/sum".encode("ascii")
+        count_postfix = "/stat_value/count".encode("ascii")
+
+        workspace.FeedBlob("value", np.array([put_value], dtype=np.float))
+
+        workspace.RunOperatorOnce(core.CreateOperator(
+            "AveragePut",
+            "value",
+            [],
+            stat_name=stat_name,
+            magnitude_expand=magnitude_expand,
+            bound=True))
+
+        workspace.RunOperatorOnce(core.CreateOperator(
+            'StatRegistryExport', [], ['k', 'v', 't']))
+
+        k = workspace.FetchBlob('k')
+        v = workspace.FetchBlob('v')
+
+        stat_dict = dict(zip(k, v))
+
+        self.assertIn(stat_name + sum_postfix, stat_dict)
+        self.assertIn(stat_name + count_postfix, stat_dict)
+        self.assertEquals(stat_dict[stat_name + sum_postfix],
+            9223372036854775807)
         self.assertEquals(stat_dict[stat_name + count_postfix], 1)
32 PyTorch 08b1324ec26043b1acfaf4b65335c671c8658a3c C Fix overflow integer overflow in remainder operator tensor math tensor math tensor math, remainder operator rewrite math formula rewrite math formula, add overflow check The sign of the result of modulo should be the same as the denominator's. The old code checked this by testing whether the product of the result and the divisor was negative, but that multiplication can itself overflow, and signed overflow is undefined behavior in C, so the compiler is allowed to emit any machine code for it: a compiler upgrade may break the check or, more likely, silently drop it, since per the standard a signed integer cannot overflow and the compiler may ignore that condition. The commit compares the signs directly instead.    TensorRemainderOp(T v) : val(v) {}
   __device__ __forceinline__ void operator()(T* out, T* in) {
     *out = *in % val;
-    if ((*out * val) < 0){
       *out += val;
     }
static inline bool has_different_sign(real a, real b) {
+  return (a < 0) != (b < 0);
+}

   TensorRemainderOp(T v) : val(v) {}
   __device__ __forceinline__ void operator()(T* out, T* in) {
     *out = *in % val;
+    if (has_different_sign<T>(*out, val)){
       *out += val;
     }
def _test_remainder_overflow(self, dtype=torch.int64):
+        # Check Integer Overflows
+        x = torch.tensor(23500, dtype=dtype)
+        q = 392486996410368
+        self.assertEqual(x % q, x)
+        self.assertEqual(-x % q, q - x)
+        self.assertEqual(x % -q, x - q)
+        self.assertEqual(-x % -q, -x)
+
+    def test_remainder_overflow(self):
+        self._test_remainder_overflow(self, dtype=torch.int64)
For CUDA:
+    def test_remainder_overflow(self):
+        TestTorch._test_remainder_overflow(self, dtype=torch.cuda.int64)
33 PyTorch 6185b27cc6645d8055b76f9cc330b010d1c2a258 C++ Fix loss of precision Standard_gamma_grad computes the reparameterized gradient -(d/dalpha cdf(x;alpha)) / pdf(x;alpha) for random number x drawn from a standard Gamma distribution Gamma(alpha)
standard_gamma_grad_one(scalar alpha, scalar x)
low precision of gamma distribution gradient statistical distributions statistical distributions gradients, gamma distribution use a different algorithm use a different algorithm Use Taylor series expansion and Rice saddle point expansion instead of asymptotic approximation for calculating the gamma distribution gradient. In particular, use a Taylor series expansion for small x and a Rice saddle point expansion for large alpha.
34 PyTorch c43b120d4329dbcbed114eae8b4cfb23f11b3779 C Fix loss of precision The linspace operation creates a one-dimensional tensor of size steps whose values are evenly spaced from start to end, inclusive. Low float precision in the linspace operation tensor math tensor math linspace (i.e.: 1D tensor creation) rewrite math formula rewrite math formula Reverse the order of multiplication and division: divide (b-a) by (n-1) first and multiply by i afterwards. However, this formula can yield a slightly different result. *r__data = a + i*(b-a)/((real)(n-1)); *r__data = a + (b-a)/((real)(n-1))*i; start, start + (end-start)/(steps-1), ..., start + (steps - 2) * (end-start)/(steps-1) multiply, divide, add, subtract
35 PyTorch 415658836538d69362ed5482dc5fbfdba39a1c69 C++ Unit test hardware result of log is slightly different on different hardware platforms Inexact Logarithms are easy to compute in some cases, such as log10(1000) = 3. In general, logarithms can be calculated using power series or the arithmetic–geometric mean, or be retrieved from a precalculated logarithm table that provides a fixed precision. Log approximation is not bitwise identical on different hardware platforms. Different processors (in this case Broadwell vs. Skylake) can behave differently when it comes to floating point operations. Log is implemented in software, so different hardware platforms using the same software stack may be using the same algorithm to calculate log, but the primitive floating point operations can differ between FPU implementations. tensor math tensor math testing output accuracy, log approximation, hardware, tensor math relax accuracy test tolerance relax accuracy test tolerance Rather than asserting a bit-by-bit perfect match, compare with a tolerance of the 32-bit floating point epsilon. Epsilon is the smallest number that, when added to the floating point number 1.0, yields a value greater than 1.0.
Allow 1 ULP (unit in the last place) of tolerance by allowing an epsilon relative tolerance error. Epsilon is defined using the C++ standard library's numeric_limits for float, which returns the machine epsilon, that is, the difference between 1.0 and the next value representable by the floating-point type T.
// Results should be bit-identical.
    ASSERT_TRUE(
        memcmp(
            B_ref.data_ptr<float>(), B_t.data_ptr<float>(), B_ref.nbytes()) ==
        0);
// Results should be bit-identical.
    ASSERT_TRUE(torch::allclose(
        B_t, B_ref, /*rtol=*/eps, /*atol=*/0.0f, /*equal_nan=*/true))
        << "Input[:8]\n"
        << A_t.index({Slice(0, 8)}) << "\n"
        << "Test[:8]\n"
        << B_t.index({Slice(0, 8)}) << "\n"
        << "Ref[:8]\n"
        << B_ref.index({Slice(0, 8)}) << diffs(B_t, B_ref);
N/A log log approximation
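A small numpy sketch of the relaxed comparison (illustrative, not the C++ test): allow a relative error of one float32 machine epsilon instead of demanding bit-identical results:

import numpy as np

eps = np.finfo(np.float32).eps       # 2**-23, the float32 machine epsilon
a = np.float32(1.0)
b = a + eps                          # one ULP away from 1.0

print(b == a)                                # False: not bit-identical
print(np.isclose(b, a, rtol=eps, atol=0.0))  # True: within 1 ULP relative tolerance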
36 PyTorch 2e35fe953553247d8a22fc38b039374e426f13b8 C++ Speed optimization inefficient algorithm low speed of model training N/A variational maximum likelihood (VML) is a parametric statistical estimation technique. VML (Beal, 2003)
also referred to as (variational) expectation-maximization (McLachlan and Krishnan, 2007; Barber, 2012), can be considered a semi-Bayesian estimation approach.
VML rests on a decomposition of the log marginal likelihood
The FPU has only one divider, so FP division operations are slow. tensor math tensor math log approximation, tensor math use a different algorithm use a different algorithm Implement a VML-based log approximation, which will be faster. It increases speed by keeping the floating point units busy and by avoiding division operations to allow for better instruction-level parallelism. Use a power series (log vml) instead of sleef. N/A // Generate every single-precision FP value in [1.0, 2.0).
+  auto eps = std::numeric_limits<float>::epsilon();
+  at::Tensor A_t = torch::arange(1.0f, 2.0f, eps);
+  ASSERT_EQ(A_t.numel(), 1 << 23);
+
+  test(A_t);
+
+  test(A_t * 2.0f);
+  test(A_t * 0.5f);
+
+  test(A_t * 4.0f);
+  test(A_t * 0.25f);
+
+  test(A_t * powf(2.0f, 16));
+  test(A_t * powf(2.0f, -16));
+
+  test(A_t * powf(2.0f, 126));
+  test(A_t * powf(2.0f, -126));
+
+  test(torch::full({32}, INFINITY));
+  test(torch::full({32}, NAN));
+
+  auto min = std::numeric_limits<float>::min();
+  auto denorm_min = std::numeric_limits<float>::denorm_min();
+
+  // Denormals aren't bit precise, because sleef isn't bit-precise either.
+  A_t = torch::arange(0.0f, min, denorm_min);
+  ASSERT_EQ(A_t.numel(), 1 << 23);
+  auto B_ref = at::log(A_t);
+  auto B_t = at::empty_like(B_ref);
+  cg.call({A_t.data_ptr<float>(), B_t.data_ptr<float>(), A_t.numel()});
+  ASSERT_TRUE(torch::allclose(B_t, B_ref));
+}
log log
37 PyTorch 1047957831e2ef68d60af90865187e46ba6e5e86 C++ Speed optimization inefficient algorithm low speed of model training N/A SLEEF stands for SIMD Library for Evaluating Elementary Functions. It implements manually vectorized versions of all C99 real floating point math functions. It can utilize SIMD instructions that are available on modern processors. SLEEF is designed to efficiently perform computation with SIMD instructions by reducing the use of conditional branches and scatter/gather memory access. Log can be slow to compute, an optimized algorithm can help. tensor math tensor math log approximation,   tensor math use a different algorithm use a different algorithm add log approximation based on SLEEF. N/A log
38 PyTorch 2572d7a67123fdccef8979520be335c95605cf82 Python Unit test loss of precision Inaccurate result Inexact PyTorch provides two different modes of quantization: Eager Mode Quantization and FX Graph Mode Quantization. Eager Mode Quantization is a beta feature. The user needs to do fusion and specify where quantization and dequantization happen manually; also, it only supports modules and not functionals. https://pytorch.org/docs/stable/quantization.html quantization aware training (weights quantized, activations quantized, quantization numerics modeled during training) Needed a unit test for leaky relu in quantization aware training quantization quantization testing, quantization, leaky relu, eager mode quantization, qat conversion, quantization aware training add test/warning add precision test Add numerical test for conversion in qat (Quantization-aware training) for leaky relu def _test_activation_impl(
            self, float_module, float_op, quantized_module, quantized_op):
        ''' Test for activation op(with inplace options), float_op can be
        torch op or functional op
        '''
        class M(torch.nn.Module):
            def __init__(self, is_module, inplace):
                super(M, self).__init__()
                self.is_module = is_module
                self.inplace = inplace
                if self.is_module:
                    self.op = float_module(self.inplace)
                else:
                    self.op = float_op

            def forward(self, input):
                if self.is_module:
                    return self.op(input)
                else:
                    return self.op(input, self.inplace)

        options = itertools.product([True, False], [True, False], self.static_quant_types)
        quantized_nodes = {
            # is_module
            True: ns.call_module(quantized_module),
            False: ns.call_function(quantized_op),
        }

        for is_module, is_inplace, quant_type in options:
            self.checkGraphModeFxOp(
                M(is_module, is_inplace), self.img_data_2d,
                quant_type, quantized_nodes[is_module])
class TestEagerModeQATOps(QuantizationTestCase):
+    def _test_activation_convert_numerics_impl(self, Act, data):
         class M(torch.nn.Module):
             def __init__(self):
                 super().__init__()
@@ -1321,6 +1321,10 @@ class TestEagerModeQATOps(QuantizationTestCase):
         m = convert(m)
         checkNoFQModule(m)

class TestQATActivationOps(QuantizationTestCase):
    def _test_activation_convert_numerics_impl(self, Act, data):
        class M(torch.nn.Module):
            def __init__(self):
                super().__init__()
                self.act = Act()
                self.quant = QuantStub()
                self.dequant = DeQuantStub()

            def forward(self, x):
                x = self.quant(x)
                x = self.act(x)
                x = self.dequant(x)
                return x

        m = M().train()
        m.qconfig = default_qat_qconfig
        m = prepare_qat(m)
        before_convert = m(data)
        m = convert(m)
        after_convert = m(data)
        self.assertEqual(before_convert, after_convert)

+    def test_leaky_relu(self):
+        data = torch.randn(1, 3, 2, 4)
+        self._test_activation_convert_numerics_impl(nn.LeakyReLU, data)
LeakyReLU(x) = max(0, x) + negative_slope * min(0, x) leaky relu
39 PyTorch c9a8413306312b2f2789dd46d5ac1a947be6b556 Cuda Fix loss of precision NaN, Inf gradients Creating and using character or word embeddings is the mainstream approach for handling most NLP tasks. Each character/word is matched with a numeric vector to create a numerical vector representation of text, which can be input into a model. Intermediate calculations were done in the same type as the output; in the case of float16 this can lead to loss of precision. During FP16 training, char_embeddings.weight gets NaN or Inf gradients other NLP backward pass, character embedding, NLP increase variable precision/change variable type increase variable precision Use higher precision for the variable that holds the intermediate result, i.e. use a `float32` temporary tensor when the input is `float16`
40 PyTorch 699de487db9f2cb6de5cba9588311eed46a8ccb3 C++ New feature N/A trapezoidal rule for integration is an approximation technique for calculating area under a curve based on summing trapezoids under a curve

The estimated integral of a function y of x, sampled at points (y_1, ..., y_n) that are separated by distance (dx_1, ..., dx_{n-1}), is given by the trapezoid rule: sum_{i=1}^{n-1}  dx_i * (y_i + y_{i+1}) / 2
N/A other integration integration other add new algorithm Add numerical integration based on the trapezoidal rule that matches the numpy implementation N/A Tensor do_trapz(const Tensor& y, const Tensor& dx, int64_t dim) {
+    Tensor left = y.slice(dim, 0, -1);
+    Tensor right = y.slice(dim, 1);
+
+    return ((left + right) * dx).sum(dim) / 2.;
+}
+
+// When dx is constant, the above formula simplifies
+// to dx * [(\sum_{i=1}^n y_i) - (y_1 + y_n)/2]
+Tensor do_trapz(const Tensor& y, double dx, int64_t dim) {
+    return (y.sum(dim) - (y.select(dim, 0) + y.select(dim, -1)) * (0.5)) * dx;
+}
+
+Tensor zeros_like_except(const Tensor& y, int64_t dim) {
+    auto sizes = y.sizes().vec();
+    dim = maybe_wrap_dim(dim, y.dim());
+    sizes.erase(sizes.begin() + dim);
+    return at::zeros(sizes, y.options());
+}
+
+}
+
+Tensor trapz(const Tensor& y, const Tensor& x, int64_t dim) {
+    dim = maybe_wrap_dim(dim, y);
+    // asking for the integral with zero samples is a bit nonsensical,
+    // but we'll return "0" to match numpy behavior.
+    if (y.size(dim) == 0) {
+        return zeros_like_except(y, dim);
+    }
+    Tensor x_viewed;
+    if (x.dim() == 1) {
+        TORCH_CHECK(x.size(0) == y.size(dim), "trapz: There must be one `x` value for each sample point");
+        DimVector sizes(y.dim(), 1);
+        sizes[dim] = x.size(0);
+        x_viewed = x.view(sizes);
+    } else {
+        x_viewed = x;
+    }
+    Tensor x_left = x_viewed.slice(dim, 0, -1);
+    Tensor x_right = x_viewed.slice(dim, 1);
+
+    Tensor dx = x_right - x_left;
+    return do_trapz(y, dx, dim);
+}
+
+Tensor trapz(const Tensor& y, double dx, int64_t dim) {
+    // see above
+    if (y.size(dim) == 0) {
+        return zeros_like_except(y, dim);
+    }
+    return do_trapz(y, dx, dim);
def test_trapz(self):
+        f_args_variable = (torch.randn(2, 3, requires_grad=True),
+                           torch.tensor([[1.0, 2.0, 5.5], [2.3, 0.5, 6.2]], requires_grad=True))
+        f_args_tensor = deepcopy(unpack_variables(f_args_variable))
+        run_functional_checks(self, "test_trapz", "trapz",
+                              lambda y, x: torch.trapz(y, x),
+                              True, f_args_variable, f_args_tensor)

@unittest.skipIf(not TEST_NUMPY, "Numpy not found")
+    def test_trapz(self):
+        def test_dx(sizes, dim, dx, device):
+            t = torch.randn(sizes, device=device)
+            actual = torch.trapz(t, dx=dx, dim=dim)
+            expected = np.trapz(t.cpu().numpy(), dx=dx, axis=dim)
+            self.assertEqual(expected.shape, actual.shape)
+            self.assertTrue(np.allclose(expected, actual.cpu().numpy()))
+
+        def test_x(sizes, dim, x, device):
+            t = torch.randn(sizes, device=device)
+            actual = torch.trapz(t, x=torch.tensor(x, device=device), dim=dim)
+            expected = np.trapz(t.cpu().numpy(), x=x, axis=dim)
+            self.assertEqual(expected.shape, actual.shape)
+            self.assertTrue(np.allclose(expected, actual.cpu().numpy()))
+
+        for device in torch.testing.get_all_device_types():
+            test_dx((2, 3, 4), 1, 1, device)
+            test_dx((10, 2), 0, 0.1, device)
+            test_dx((1, 10), 0, 2.3, device)
+            test_dx((0, 2), 0, 1.0, device)
+            test_dx((0, 2), 1, 1.0, device)
+            test_x((2, 3, 4), 1, [1.0, 2.0, 3.0], device)
+            test_x((10, 2), 0, [2.0, 3.0, 4.0, 7.0, 11.0, 14.0, 22.0, 26.0, 26.1, 30.3], device)
+            test_x((1, 10), 0, [1.0], device)
+            test_x((0, 2), 0, [], device)
+            test_x((0, 2), 1, [1.0, 2.0], device)
+            with self.assertRaisesRegex(
+                    IndexError,
+                    'Dimension out of range'):
+                test_x((2, 3), 2, [], device)
+                test_dx((2, 3), 2, 1.0, device)
+            with self.assertRaisesRegex(
+                    RuntimeError,
+                    'There must be one `x` value for each sample point'):
+                test_x((2, 3), 1, [1.0, 2.0], device)
+                test_x((2, 3), 1, [1.0, 2.0, 3.0, 4.0], device)
y = 1/(1+exp(-x)), x = logit(y) integration
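A quick sanity check of the formula above (a usage sketch with made-up sample values, not part of the commit): torch.trapz should reproduce the hand-computed trapezoid sum.
import torch

y = torch.tensor([1.0, 4.0, 9.0])
x = torch.tensor([0.0, 1.0, 3.0])
# sum_i dx_i * (y_i + y_{i+1}) / 2 = 1*(1+4)/2 + 2*(4+9)/2 = 2.5 + 13.0 = 15.5
print(torch.trapz(y, x))   # tensor(15.5000)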
41 PyTorch c5d5d45f40969cbddbb7f87da343dfd422503c1c Python Fix overflow/underflow overflow/underflow The absolute value of the Jacobian determinant at p gives us the factor by which the function f expands or shrinks volumes near p; this is why it occurs in the general substitution rule.
The Jacobian determinant is used when making a change of variables when evaluating a multiple integral of a function over a region within its domain.
According to the inverse function theorem, the matrix inverse of the Jacobian matrix of an invertible function is the Jacobian matrix of the inverse function.
The log absolute determinant of the Jacobian of the sigmoid transformation is unstable and returns NaN/-Inf when y saturates to 0 or 1 statistical distributions statistical distributions Log absolute determinant Jacobian, distribution transformation rewrite math formula rewrite math formula Rewrite the log abs det jacobian method in terms of the pre-sigmoid value x (see the sketch after this entry) -(y.reciprocal() + (1 - y).reciprocal()).log() -F.softplus(-x) - F.softplus(x) reciprocal
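For y = sigmoid(x), log|dy/dx| = log y + log(1-y) = -softplus(-x) - softplus(x), which is what the patch computes. A minimal sketch (illustrative values only) of why the y-based form breaks once y saturates in float32 while the x-based form stays finite:
import torch
import torch.nn.functional as F

x = torch.tensor([0.5, 100.0])
y = torch.sigmoid(x)                                   # second entry rounds to exactly 1.0 in float32

old = -(y.reciprocal() + (1 - y).reciprocal()).log()   # tensor([-1.4482, -inf]): 1/(1 - y) overflows
new = -F.softplus(-x) - F.softplus(x)                  # tensor([-1.4482, -100.]): finite and correct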
42 PyTorch 645ad7ad0c89ecef61e89666745324deba31c8b7 Python Fix underflow NaN LP refers to the Lp norm: at p = 1 one gets sum pooling (which is proportional to average pooling), and p = inf gives max pooling The gradient in LP pooling 1D and 2D becomes NaN when all inputs are zero. If all inputs are zero then the sum of x to the power of p is zero, and the derivative of the pth root at zero involves division by zero, which produces NaN CNN operations pooling layer LP pooling rewrite math formula rewrite math formula Add a ReLU unit to LP pooling to avoid gradient = NaN. After adding this patch the gradient will be set to zero as opposed to NaN. return out.mul(kw * kh).pow(1. / norm_type) return (torch.sign(out) * relu(torch.abs(out))).mul(kw * kh).pow(1. / norm_type) pth root of sum of x^p
43 PyTorch de42542351ad933ada59a4a8cf3b247d75d52917 Python Fix loss of precision Precision matrix (also known as concentration matrix) is the matrix inverse of the covariance matrix. The multivariate normal distribution can be parametrized either by the covariance matrix or the precision matrix. precision matrix computation in the multivariate normal distribution is unstable due to matrix inverse statistical distributions statistical distributions distributions, precision matrix, multivariate normal distribution rewrite math formula rewrite math formula Previously the precision matrix was computed via the inverse of the covariance matrix. Compute the precision matrix from scale_tril instead; scale_tril is a lower-triangular k x k matrix with a non-zero diagonal. -        flat_conv = self.covariance_matrix.reshape((-1,) + self._event_shape * 2)
-        flat_precision = torch.stack([C.inverse() for C in flat_conv], 0)
       scale_tril_inv = _batch_inverse(self.scale_tril)
+        flat_scale_tril_inv = self.scale_tril.reshape((-1,) + self._event_shape * 2)
+        flat_precision = torch.bmm(flat_scale_tril_inv.transpose(-1, -2),
+                                   flat_scale_tril_inv)
matrix inverse
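A minimal sketch of the idea behind the patch (illustrative tensors, and using the current torch.linalg API rather than the _batch_inverse helper from the commit): invert only the well-conditioned triangular factor L and form the precision matrix as L^{-T} L^{-1}, instead of inverting the full covariance matrix.
import torch

cov = torch.tensor([[4.0, 2.0],
                    [2.0, 3.0]])                    # hypothetical covariance matrix
L = torch.linalg.cholesky(cov)                      # scale_tril: cov = L @ L.T

unstable = torch.inverse(cov)                       # direct inverse of the covariance matrix

L_inv = torch.linalg.solve_triangular(L, torch.eye(2), upper=False)
stable = L_inv.T @ L_inv                            # precision = L^{-T} @ L^{-1}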
44 PyTorch 8cff8e93d21142ff42b9d2b1f45b01acde0b9d99 Python Fix loss of precision NaN PyTorch no longer has `from torch.distributions.utils import _finfo`; the equivalent functionality is now provided by torch.finfo
A torch.finfo is an object that represents the numerical properties of a floating point torch.dtype, (i.e. torch.float32, torch.float64, and torch.float16). This is similar to numpy.finfo.
Need a function for checking numerical properties of variables and calculating epsilon, which is used, for example, in softmax. Different floating point types have different characteristics with regard to their precision: what is the smallest positive number they can represent, what is the smallest number that can be added to one without truncation, etc. statistical distributions statistical distributions distributions (Laplace, Gumbel, Gamma, Dirichlet) use a different algorithm use a different algorithm PyTorch has many different datatypes with varying degrees of precision. _finfo allows one to get information about characteristics such as the smallest number that can be added to 1 without truncation (eps) and the smallest positive number greater than zero (tiny) for each floating point type. The newly implemented _finfo is used to clamp the Gamma, Beta, and Dirichlet distributions to avoid NaNs. def _get_clamping_buffer(tensor):
-    clamp_eps = 1e-6
-    if isinstance(tensor, Variable):
-        tensor = tensor.data
-    if isinstance(tensor, (torch.DoubleTensor, torch.cuda.DoubleTensor)):
-        clamp_eps = 1e-15
-    return clamp_eps

eps = _get_clamping_buffer(probs)
# This follows semantics of numpy.finfo.
+_Finfo = namedtuple('_Finfo', ['eps', 'tiny'])
+_FINFO = {
+    torch.HalfStorage: _Finfo(eps=0.00097656, tiny=6.1035e-05),
+    torch.FloatStorage: _Finfo(eps=1.19209e-07, tiny=1.17549e-38),
+    torch.DoubleStorage: _Finfo(eps=2.22044604925e-16, tiny=2.22507385851e-308),
+    torch.cuda.HalfStorage: _Finfo(eps=0.00097656, tiny=6.1035e-05),
+    torch.cuda.FloatStorage: _Finfo(eps=1.19209e-07, tiny=1.17549e-38),
+    torch.cuda.DoubleStorage: _Finfo(eps=2.22044604925e-16, tiny=2.22507385851e-308),
+}

_finfo doc comment:
def _finfo(tensor):
    """
    Return floating point info about a `Tensor` or `Variable`:
    - `.eps` is the smallest number that can be added to 1 without being lost.
    - `.tiny` is the smallest positive number greater than zero
      (much smaller than `.eps`).

    Args:
        tensor (Tensor or Variable): tensor or variable of floating point data.
    Returns:
        _Finfo: a `namedtuple` with fields `.eps` and `.tiny`.
    """

eps = _finfo(probs).eps
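A minimal sketch of how the same clamping is expressed with today's torch.finfo (the probs tensor is hypothetical): clamp probabilities away from 0 and 1 by the dtype-specific eps so that downstream log computations cannot produce -inf or NaN.
import torch

probs = torch.tensor([0.0, 0.3, 1.0], dtype=torch.float32)   # hypothetical distribution parameters
eps = torch.finfo(probs.dtype).eps                            # 1.1921e-07 for float32
tiny = torch.finfo(probs.dtype).tiny                          # 1.1755e-38 for float32

clamped = probs.clamp(min=eps, max=1 - eps)                   # safe input for log(p) and log(1 - p)
print(clamped.log())                                          # finite, no -inf/NaN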
45 PyTorch bc505100167f61ce241f511741794dfe2f89c5f0 Python Fix loss of precision The logit is the natural logarithm of the odds p / (1-p), where p is a probability. Probabilities range from zero to one, i.e., p∈[0,1], whereas logits can be any real number (from minus infinity to infinity) numerical stability of the batch LR loss implementation loss functions loss functions loss, caffe2, batch lr loss use a different algorithm use a different algorithm Delete the code path that uses probabilities and use only logits in batch LR loss -        if schema.is_schema_subset(
-            schema.Struct(
-                ('label', schema.Scalar()),
-                ('logit', schema.Scalar())
-            ), self.input_record
-        ):
-            label = self.input_record.label()
-            # mandatory cast to float32
-            # self.input_record.label.field_type().base is np.float32 but
-            # label type is actually int
-            label = net.Cast(
-                label,
-                net.NextScopedBlob('label_float32'),
-                to=core.DataType.FLOAT)
-            label = net.ExpandDims(label, net.NextScopedBlob('expanded_label'),
-                                    dims=[1])
-            xent = net.SigmoidCrossEntropyWithLogits(
-                [self.input_record.logit(), label],
-                net.NextScopedBlob('cross_entropy'),
-            )
-        # TODO(T23937449): Change all the use cases of BatchLRLoss to the
-        # numerically stable version
-        else:
-            class_probabilities = net.MakeTwoClass(
-                self.input_record.prediction.field_blobs(),
-                net.NextScopedBlob('two_class_predictions')
-            )
-            label = self.input_record.label.field_blobs()
-            label = [net.Cast(
-                label,
-                net.NextScopedBlob('int32_label'),
-                to=core.DataType.INT32)]
-            xent = net.LabelCrossEntropy(
-                [class_probabilities] + label,
-                net.NextScopedBlob('cross_entropy'),
-            )
label = self.input_record.label()
+        # mandatory cast to float32
+        # self.input_record.label.field_type().base is np.float32 but
+        # label type is actually int
+        label = net.Cast(
+            label,
+            net.NextScopedBlob('label_float32'),
+            to=core.DataType.FLOAT)
+        label = net.ExpandDims(label, net.NextScopedBlob('expanded_label'),
+                                dims=[1])
+        xent = net.SigmoidCrossEntropyWithLogits(
+            [self.input_record.logit(), label],
+            net.NextScopedBlob('cross_entropy'),
+        )
logit = ln(p/(1-p)) ln
46 PyTorch 40b783b746b4f5775c97c7fe41dfb011b545665a Python Unit test loss of precision A simple approximation of the first derivative is f'(x) ~ (f(x+h)-f(x))/h, where h is the step size. Unit test failing because the numerical approximation of the derivative (i.e., the gradient) of PReLU uses a step size that is too large, which causes a large approximation error. activation functions activation functions testing accuracy, gradients, caffe2, pReLU rewrite math formula rewrite math formula Improve the gradient check asserts in the test by using a smaller step size self.assertGradientChecks(gc, op, [X, W], 0, [0]) self.assertGradientChecks(gc, op, [X, W], 0, [0], stepsize=1e-2)     def test_prelu(self, X, alpha, inplace, shared, order, seed, gc, dc):
        np.random.seed(seed)
        W = np.random.randn(
            X.shape[1] if order == "NCHW" else X.shape[3]).astype(np.float32)

        if shared:
            W = np.random.randn(1).astype(np.float32)

        # go away from the origin point to avoid kink problems
        X += 0.04 * np.sign(X)
        X[X == 0.0] += 0.04

        def prelu_ref(X, W):
            Y = X.copy()
            W = W.reshape(1, -1, 1, 1) if order == "NCHW" \
                else W.reshape(1, 1, 1, -1)
            assert len(X.shape) == 4
            neg_indices = X <= 0
            assert len(neg_indices.shape) == 4
            assert X.shape == neg_indices.shape
            Y[neg_indices] = (Y * W)[neg_indices]
            return (Y,)

        op = core.CreateOperator(
            "PRelu", ["X", "W"], ["Y" if not inplace else "X"],
            alpha=alpha, order=order)
        self.assertReferenceChecks(gc, op, [X, W], prelu_ref, ensure_outputs_are_inferred=True)
        # Check over multiple devices
        self.assertDeviceChecks(dc, op, [X, W], [0])

        if not inplace:
            # Gradient check wrt X
            self.assertGradientChecks(gc, op, [X, W], 0, [0], stepsize=1e-2, ensure_outputs_are_inferred=True)
            # Gradient check wrt W
            self.assertGradientChecks(gc, op, [X, W], 1, [0], stepsize=1e-2, ensure_outputs_are_inferred=True)
PReLU(x)=max(0,x)+a∗min(0,x) pReLU
47 PyTorch e187ba7a9fb18aba0a0651e05c20e1f491d989fc Python Fix loss of precision inaccurate result Inexact Fmod computes the element-wise remainder of division. When the divisor is zero, returns NaN for floating point dtypes on both CPU and GPU; raises RuntimeError for integer division by zero on CPU; Integer division by zero on GPU may return any value. Unit tests for Fmod/Remainder fail due to the numerical Jacobian check. Previously, tests for Fmod and Remainder added 5e-2 to the denominator tensor (the same as the div tests), which only avoids division by 0, but not issues with computing the numerical Jacobian due to the non-linearity of fmod/remainder when input / divisor is close to an integer. gradients/derivatives automatic differentiation testing accuracy, automatic differentiation, remainder, numerical jacobian rewrite math formula rewrite math formula Ensure that the result of input / divisor is not close to an integer. Add 1.5 to the denominator instead of 5e-2 to make it unlikely that the quotient is close to an integer (see the sketch after this entry). Note: this code is no longer in PyTorch; the change specifically decreases the probability of numerical issues in the numerical Jacobian computation. remainder
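A minimal sketch of the failure mode (illustrative numbers, not the original test; math.fmod on Python floats stands in for the tensor op): when input/divisor is close to an integer, fmod has a jump discontinuity, so a finite-difference estimate of the derivative is wildly wrong even though the analytical derivative with respect to the dividend is simply 1 away from the jump.
import math

a, b, h = 3.9995, 2.0, 1e-3

# central finite difference straddles the jump of fmod at a = 4.0
numerical = (math.fmod(a + h, b) - math.fmod(a - h, b)) / (2 * h)
print(numerical)   # ~ -999: dominated by the jump from ~2 back to ~0, while the true derivative is 1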
48 PyTorch 67968cb60b1d3021834594967d4140a36a8213e3 Python Fix overflow/loss of precision Binary cross entropy with logits measures the probability error in tasks with two outcomes in which each outcome is independent and need not have a fully certain label. For instance, one could perform a regression where the probability of an event happening is known and used as a label. This loss may also be used for binary classification, where labels are either zero or one. Using sigmoid followed by a BCE loss layer can be less stable than a single layer that combines sigmoid with BCE loss loss functions loss functions binary cross entropy loss use a different algorithm use a different algorithm Combine sigmoid and BCE loss into one layer and utilize the log-sum-exp trick. This is more stable than using a plain sigmoid followed by a BCE loss def binary_cross_entropy_with_logits(input, target, weight=None, size_average=True):
+    r"""Function that measures Binary Cross Entropy between target and output logits:
+
+    See :class:`~torch.nn.BCEWithLogitsLoss` for details.
+
+    Args:
+        input: Variable of arbitrary shape
+        target: Variable of the same shape as input
+        weight (Variable, optional): a manual rescaling weight
+                if provided it's repeated to match input tensor shape
+        size_average (bool, optional): By default, the losses are averaged
+                over observations for each minibatch. However, if the field
+                sizeAverage is set to False, the losses are instead summed
+                for each minibatch.
+    """
+    if weight is not None and target.dim() != 1:
+        weight = weight.view(1, target.size(1)).expand_as(target)
+    neg_abs = - input.abs()
+    loss = input.clamp(min=0) - input * target + (1 + neg_abs.exp()).log()
+
+    if weight is not None:
+        loss = loss * weight
+
+    if size_average:
+        return loss.mean()
+    else:
+        return loss.sum()
loss(o, t) = - 1/n \sum_i (t[i] * log(sigmoid(o[i])) + (1 - t[i]) * log(1 - sigmoid(o[i]))) log, sigmoid
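A minimal sketch (hypothetical logit/target values) of why the combined formulation above is more stable than applying sigmoid and then BCE: for a large-magnitude logit, sigmoid saturates to exactly 1.0 in float32 and log(1 - sigmoid(x)) becomes -inf, while the clamp/log1p form stays finite.
import torch

x = torch.tensor([100.0])   # logit
t = torch.tensor([0.0])     # target

naive = -(t * torch.log(torch.sigmoid(x)) + (1 - t) * torch.log(1 - torch.sigmoid(x)))
print(naive)                # tensor([inf]): log(1 - 1.0) becomes log(0)

stable = x.clamp(min=0) - x * t + torch.log1p(torch.exp(-x.abs()))
print(stable)               # tensor([100.]): matches the exact loss value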
49 PyTorch 7ba5e7cea1d2be485d2806ad38608dad9bcc7041 Python Fix loss of precision Pooling layers are used to reduce the dimensions of the feature maps and to summarize them. A max pooling layer returns the maximum values of rectangular regions of its input.

Boundary conditions (b.c.) are constraints necessary for the solution of a boundary value problem. A boundary value problem is a differential equation (or system of differential equations) to be solved in a domain on whose boundary a set of conditions is known.
The VolumetricMaxPooling (in legacy.nn) precision test kept failing:
there was a set of indices in the same pooling window whose values differed by less than epsilon, so the numeric gradient was hitting a boundary condition (max-pooling is discontinuous, of course)
CNN operations pooling layer testing, max pooling rewrite math formula rewrite math formula Modify the test so the input tensor avoids these boundary conditions: generate the input with torch.randn and scale it by 1000, which makes near-ties within a pooling window unlikely. input_size=(2, 3, 5, 5, 5)) input=(torch.randn(2, 3, 5, 5, 5) * 1000)),
50 PyTorch a03692069ebe19038bfccf5a59208ed2989bd4d9 Python Unit test loss of precision Unit test sometimes failing because of numerical gradient approximation error loss functions loss functions caffe2, loss relax accuracy test tolerance relax accuracy test tolerance Increase the tolerance used when comparing the gradient to make the test pass delta=1e-3 delta=1e-2 * abs(np.asscalar(dx[0])))
51 PyTorch 33cc71dc55db073ba46b065e24cff0d26156376f C Fix loss of precision Returns a 1-D tensor of size (end-start)/step + 1 with values from start to end in increments of step. Step is the gap between two values in the tensor. Precision can be lost when floats get very small, causing unexpected behavior tensor math tensor math range (i.e.: 1-D tensor) rewrite math formula rewrite math formula When dividing (xmax - xmin) by step, the numerator can become very small if xmax and xmin are close to each other. It is mathematically equivalent and more stable when xmax and xmin are close to each other to distribute the division and then subtract, i.e. (xmax / step) - (xmin / step). Note: This function is deprecated and will be removed in a future release because its behavior is inconsistent with Python’s range builtin. Instead, use torch.arange(), which produces values in [start, end). void THTensor_(range)(THTensor *r_, real xmin, real xmax, real step)
size = (long)((xmax-xmin)/step+1);
void THTensor_(range)(THTensor *r_, accreal xmin, accreal xmax, accreal step)
size = (long)((xmax/step - xmin/step)+1);
52 PyTorch 87fcf3072ef988b5b2e408cce141b76235929bbd C++ Fix overflow Hsum_sq performs horizontal sum of squares over a range of uint8_t, returns row sum
The quantized version of hsum_sq has an overflow when the input image size is large, such as (H,W,D) = (224,224,160) quantization quantization quantization, sum of squares use a different algorithm use a different algorithm Rewrite the for-loop definition to include an overflow threshold to prevent overflow for (; i < len / 16 * 16; i += 16) {
  }
int overflow_threshold = 262144; // 2147483647(max of int32)/(256*256)*8 = 262144
int loop = len / overflow_threshold + 1;
for(int j=0; j<=loop; j++){
for (; ((i < overflow_threshold * j) && (i < len / 16 * 16)); i += 16) {
53 PyTorch 45aaaef22cdc9d87f2c04762fce9ffeeff290330 Python Unit test overflow exception Python uses arbitrary precision integers, which can scale to be as large as needed, up to the amount of memory available to the computer. C++ primitives are fixed in their precision, and are commonly either 8, 16, 32, or 64 bits. A timing function in code used for benchmarking can overflow when calling C++ code precision tests/speed benchmarks timing benchmarking timing add overflow check add overflow check Check if the operation would overflow a 32-bit signed primitive from Python before using this value in C++. Add a break statement with an overflow threshold condition to prevent overflow # Avoid overflow in C++ pybind11 interface
+                if number * 10 > 2147483647:
+                    break
54 PyTorch c675727adf36bdbb60933c9c7529d3ee34462093 C++ Fix overflow torch.empty(size) returns a tensor filled with uninitialized data. The shape of the tensor is defined by the variable argument size Incorrect error message that fails to indicate an overflow. Overflow occurs when the input into torch.empty is very large tensor math tensor math torch.empty (i.e.: tensor with uninitialized data) fix test/warning correct error message change error message to indicate overflow TypeError: empty(): argument 'size' must be tuple of ints, but found element of type int at pos 1 RuntimeError: Overflow when unpacking long
55 PyTorch a69910868a5962e2d699c6069154836e262a29e2 Python Fix overflow DistributedSampler restricts data loading to a subset of the dataset. DistributedSampler takes a dataset as input and loads a sample of it.
torch.utils.data.distributed.DistributedSampler(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, drop_last=False)
num_replicas (int, optional): Number of processes participating in distributed training. By default, :attr:`world_size` is retrieved from the current distributed group.
DistributedSampler takes a dataset as input and loads a sample of it. When `len(dataset) * 2 < num_replicas`, there is a possibility of overflow (see the sketch after this entry)
statistical distributions data sampling distributions, sampling, data loading rewrite math formula rewrite math formula rewrite formula for indexing data points in dataset and add if else logic indices += indices[:(self.total_size - len(indices))] padding_size = self.total_size - len(indices)
+            if padding_size <= len(indices):
+                indices += indices[:padding_size]
+            else:
+                indices += (indices * math.ceil(padding_size / len(indices)))[:padding_size]
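A minimal sketch of the patched padding logic (hypothetical sizes): when the dataset is much smaller than the padded total size, the old one-shot slice cannot supply enough indices, while repeating the index list enough times before slicing always does.
import math

indices = [0, 1, 2]                        # hypothetical tiny dataset
total_size = 8                             # num_replicas * samples per replica
padding_size = total_size - len(indices)   # 5 > len(indices)

old = indices + indices[:padding_size]     # [0, 1, 2, 0, 1, 2]: only 6 of the 8 required indices

if padding_size <= len(indices):
    new = indices + indices[:padding_size]
else:
    new = indices + (indices * math.ceil(padding_size / len(indices)))[:padding_size]
# new == [0, 1, 2, 0, 1, 2, 0, 1]: exactly total_size indices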
56 PyTorch 6debe825beb36fc8e894a1b0a14bd5b4ebcd6090 GLSL, Python, C++ New feature loss of precision Vulkan is a graphics and compute open standard API. GLSL (OpenGL Shading Language) is a shading language with syntax similar to C. A shader is essentially a function required to draw something on the screen. Shaders run on a GPU.
The RelaxedPrecision allows 32-bit integer and 32-bit floating-point operations to execute with a relaxed precision of somewhere between 16 and 32 bits. More info: https://www.khronos.org/registry/spir-v/specs/1.0/SPIRV.html
Add a new feature to allow a relaxed precision mode via a cmake option non-standard precision non-standard precision GLSL shaders, GPU add new precision option add new precision option Introduces the cmake option USE_VULKAN_RELAXED_PRECISION that controls which precision is used in Vulkan shaders. This option relaxes precision so that operations execute somewhere in the 16- to 32-bit range on Vulkan. Note, the default setting is 32-bit precision. N/A option(USE_VULKAN_RELAXED_PRECISION "Use Vulkan relaxed precision(mediump)" OFF)
+if(USE_VULKAN_RELAXED_PRECISION)
+  string(APPEND CMAKE_CXX_FLAGS " -DUSE_VULKAN_RELAXED_PRECISION")
+endif()
57 PyTorch 324c18fcad579b1afa63ae45528bf598ba8ec4ca Cuda Fix underflow Computes division a/b using the formula a * (1/b) A division where the denominator is a low-precision scalar risks underflow, because the reciprocal 1/b was calculated in the same precision as the non-scalar operands. tensor math tensor math Cuda, division increase variable precision/change variable type change variable type Use the higher-precision accumulation type for the reciprocal instead of the operand type: replace scalar_t with accscalar_t. auto inv_b = scalar_t(1.0) / iter.scalar_value<scalar_t>(2); using accscalar_t = at::acc_type<scalar_t, true>;
auto inv_b = accscalar_t(1.0) / iter.scalar_value<accscalar_t>(2);
@onlyCUDA
+    @dtypes(torch.half)
+    def test_divmul_scalar(self, device, dtype):
+        x = torch.tensor(100., device=device, dtype=dtype)
+        x_ref = x.float()
+        scale = 1e5
+        res = x.div(scale)
+        expected = x_ref.div(scale)
+        self.assertEqual(res, expected.to(dtype), atol=0., rtol=0.)
+        x = torch.tensor(1e-5, device=device, dtype=dtype)
+        x_ref = x.float()
+        res = x.mul(scale)
+        expected = x_ref.mul(scale)
+        self.assertEqual(res, expected.to(dtype), atol=0., rtol=0.)
+        res = scale * x
+        self.assertEqual(res, expected.to(dtype), atol=0., rtol=0.)
division
58 PyTorch 24a8614cac3af1711eccc7294fd47ac30aefa8cc Python Add warning overflow cuFFT = CUDA Fast Fourier Transform library non-standard precision non-standard precision CUDA, half precision, warning disable test/warning add overflow warning Add a warning message to alert the programmer to possible overflow when the operation is performed in half precision. Message: "Due to limited dynamic range of half datatype, performing this operation in half precision may cause the first element of result to overflow for certain inputs"
59 PyTorch fe684679b06f7f2fe7a7e136ea5605c04254b652 C++ disable test overflow runtime error The csrc directory contains all of the code concerned with integration with Python. This is in contrast to lib, which contains the Torch libraries that are Python agnostic. csrc depends on lib, but not vice versa. Runtime error from overflow when unpacking large numbers. The bug is: torch.tensor([0.1, 999999999999999999999]) fails with "Overflow when unpacking double" other other Convert Python float to C++ float, Python integration disable test/warning disable overflow and precision test Delete code that throws an exception on overflow and lost precision -  if (PyLong_Check(obj)) {
-    int overflow;
-    long long value = PyLong_AsLongLongAndOverflow(obj, &overflow);
-    if (overflow != 0) {
-      throw std::runtime_error("Overflow when unpacking double");
-    }
-    if (value > DOUBLE_INT_MAX || value < -DOUBLE_INT_MAX) {
-      throw std::runtime_error("Precision loss when unpacking double");
-    }
-    return (double)value;
-  }

delete old solution def test_unpack_double(self, device, dtype):
+        # Reference: https://github.com/pytorch/pytorch/issues/33111
+        vals = (2 ** 24 + 1, 2 ** 53 + 1,
+                np.iinfo(np.int64).max, np.iinfo(np.uint64).max, np.iinfo(np.uint64).max + 1,
+                -1e500, 1e500)
+        for val in vals:
+            t = torch.tensor(val, dtype=dtype, device=device)
+            a = np.array(val, dtype=torch_to_numpy_dtype_dict[dtype])
+            self.assertEqual(t, torch.from_numpy(a))
60 PyTorch 7417b4c66f5b0901f206bf48b64de07384770724 Cuda Fix overflow ConvTranspose3d applies a 3D transposed convolution operator over an input image composed of several input planes. The transposed convolution operator multiplies each input value element-wise by a learnable kernel, and sums over the outputs from all input feature planes.
torch.nn.ConvTranspose3d(in_channels, out_channels, kernel_size, stride=1, padding=0, output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros')
The index in torch.nn.ConvTranspose3d overflows CNN operations convolution convolution transpose add test/warning change variable type, increase variable precision Change the index computation to use int64_t and require that input.numel() <= UINT_MAX and channels * kernel.numel() <= UINT_MAX. Note: this is a second attempt to fix the problem int data_col_index =
-                (((((c_im * kernel_t + t_k) * kernel_h + h_k) * kernel_w +
-                   w_k) *
-                      depth_col +
-                  t_col) *
-                     height_col +
-                 h_col) *
-                    width_col +
-                w_col;
const int64_t idx_k =
+                ((c_im * kernel_t + t_k) * kernel_h + h_k) * kernel_w + w_k;
+            const int64_t data_col_index =
+                ((idx_k * depth_col + t_col) *
+                    height_col + h_col) *
+                  width_col + w_col;
             val += data_col[data_col_i
const auto num_kernels = channels * depth * height * width;
+
+  auto check_fits_in_unsigned =
+    [](int64_t val, const char * name) {
+      constexpr auto umax = std::numeric_limits<unsigned>::max();
+      TORCH_CHECK(val >= 0 && val <= umax,
+                  name, " must fit in a 32-bit unsigned value");
+    };
+  check_fits_in_unsigned(num_kernels, "input size");
+  check_fits_in_unsigned(
+      channels * patch_t * patch_h * patch_w, "channels x kernel size");
61 PyTorch 0a159b0a3a78a80fb0f9082087a98f87f2dea986 C++ Fix loss of precision inaccurate/incorrect result torch.remainder gives the wrong output for very large float dividends due to loss of precision. For example,
x = torch.tensor(2749682432.0)
q = 36
print(torch.remainder(x,q))
actual output is 128.0 whereas the correct output should be 20
tensor math tensor math remainder use a different algorithm use a different algorithm Use the SLEEF library to calculate mod for floats: Sleef_fmodf8 is a vectorized single-precision floating-point remainder. return a - b * at::native::floor_impl(a / b); Vec256<BFloat16> fmod(const Vec256<BFloat16> & q) const {
+    __m256 x_lo, x_hi;
+    cvtbf16_fp32(values, x_lo, x_hi);
+    __m256 q_lo, q_hi;
+    cvtbf16_fp32(q.values, q_lo, q_hi);
+    auto o1 = Sleef_fmodf8(x_lo, q_lo);
+    auto o2 = Sleef_fmodf8(x_hi, q_hi);
+    return cvtfp32_bf16(o1, o2);

scalar_t mod = std::fmod(a, b);
if ((mod != 0) && ((b < 0) != (mod < 0))) mod += b;
return mod;
modulo
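A minimal sketch of the precision loss (reproducing the example above with numpy float32 arithmetic, since plain Python floats are double precision): the quotient a/b is rounded to a representable float32 before floor is applied, so the floor-based formula lands far from the true remainder, while fmod computes it exactly without forming the large quotient.
import numpy as np

a = np.float32(2749682432.0)
b = np.float32(36.0)

q = a / b                           # exact quotient is 76380067.55..., float32 rounds it to 76380064.0
floor_based = a - b * np.floor(q)   # far from the true remainder (the entry above reports 128.0 from this formula)
fmod_based = np.fmod(a, b)          # 20.0: the exact remainder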
62 PyTorch 63b1ae69831cd21bc4d6059a5854bc1155a152c9 Cuda Fix overflow C++ std:: fmod definition: The floating-point remainder of the division operation x/y calculated by this function is exactly the value x - n*y, where n is x/y with its fractional part truncated.

The returned value has the same sign as x and is less than y in magnitude.
If successful, returns the floating-point remainder of the division x/y as defined above.

If a domain error occurs, an implementation-defined value is returned (NaN where supported)

If a range error occurs due to underflow, the correct result (after rounding) is returned.

overflow in torch.remainder when the dividend is very large tensor math tensor math remainder rewrite math formula rewrite math formula Use fmod from the C++ standard library to calculate the remainder instead of a - b * floor(a/b), and account for an edge case: if the result of fmod is not zero (i.e., a is not divisible by b) and either (1) the divisor is less than zero while the remainder is greater than zero, or (2) the divisor is greater than zero while the remainder is less than zero, then increment the result of fmod by the divisor (see the sketch after this entry). return a - b * static_cast<scalar_t>(std::floor(a / b)); auto mod = ::fmod(a, b);
+          if ((mod != 0) && ((b < 0) != (mod < 0))) mod += b;
+          return mod;
def test_remainder_fmod_large_dividend(self, device, dtype):
+        alarge = 1e9
+        pi = 3.14159265358979
+        for avalue in [alarge, -alarge]:
+            for bvalue in [pi, -pi]:
+                a = torch.tensor([avalue], dtype=dtype, device=device)
+                b = torch.tensor([bvalue], dtype=dtype, device=device)
+                c = torch.remainder(a, b)
+                d = torch.fmod(a, b)
+                self.assertTrue((b[0] > 0) == (c[0] > 0))  # remainder has same sign as divisor
+                self.assertTrue((a[0] > 0) == (d[0] > 0))  # fmod has same sign as dividend
+                self.assertTrue(abs(c[0]) < abs(b[0]))     # remainder is within range of divisor
+                self.assertTrue(abs(d[0]) < abs(b[0]))     # fmod is within range of divisor
+                if ((a[0] > 0) == (b[0] > 0)):
+                    self.assertTrue(c[0] == d[0])   # remainder is same as fmod
+                else:
+                    self.assertTrue(abs(c[0] - d[0]) == abs(b[0]))  # differ by one divisor
remainder, division
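A minimal Python-level sketch of the patched logic (math.fmod on Python floats stands in for the CUDA ::fmod call): start from fmod, whose result has the sign of the dividend, and shift it by one divisor when its sign disagrees with the divisor's; this yields the floor-style remainder without ever forming the large quotient a/b.
import math

def remainder(a, b):
    mod = math.fmod(a, b)                     # sign follows the dividend a
    if mod != 0 and (b < 0) != (mod < 0):     # divisor and fmod result have different signs
        mod += b                              # shift into the divisor's sign range
    return mod

print(math.fmod(-7.0, 3.0))   # -1.0
print(remainder(-7.0, 3.0))   #  2.0, same sign as the divisor (matches a - b*floor(a/b))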
63 PyTorch b33e38ec475017868534eb114741ad32c9d3b248 C++ Fix loss of precision arange creates a 1D tensor using start, end, and step size. Step and input previously had the same type, but the step may require higher precision than start and end, for example a fractional step with an integral element type (e.g., T=int, step_t=double). tensor math tensor creation vectorized calculations, low level tensor math, CPU increase variable precision/change variable type increase variable precision Allow a higher-precision step type for Vec256::arange by making the type of step independent of the input type; often a double is required for the step while the input remains single precision. static Vec256<T> arange(T base = static_cast<T>(0), T step = static_cast<T>(1))   template<typename step_t>  // step sometimes requires a higher precision type (e.g., T=int, step_t=double)
  static Vec256<T> arange(T base = static_cast<T>(0), step_t step = static_cast<step_t>(1)) {
N/A
64 PyTorch 5c423cae72b3b720a0857a8237a499d0e07d6b98 Python Unit test loss of precision Linspace creates a 1D tensor of size steps whose values are evenly spaced from start to end, inclusive.
Logspace creates a 1D tensor of size steps whose values are evenly spaced from base^start to base^end inclusive, on a logarithmic scale with base "base".
The CUDA half-precision computation of linspace and logspace has poor precision tensor math tensor creation testing precision, Cuda, half precision, linspace, logspace add test/warning add precision test Adds precision tests for CUDA half (16 bits), float (32 bits), and double (64 bits). Since linspace/logspace are deterministic, we can compute an expected
amount of error (by testing without a precision override), adding a tiny amount (EPS) to that, and using that value as the override.
EPS = 1e-5
LINSPACE_LOGSPACE_EXTRA_EPS = 1e-5
+
# Tests that compare a device's computation with the (gold-standard) CPU's.
class TestDevicePrecision(TestCase):
-    def test_linspace(self, device):
-        a = torch.linspace(0, 10, 10, device=device)
-        b = torch.linspace(0, 10, 10)
+
+    # The implementation of linspace+logspace goes through a different path
+    # when the steps arg is equal to 0 or 1. For other values of `steps`
+    # they call specialized linspace (or logspace) kernels.
+    LINSPACE_LOGSPACE_SPECIAL_STEPS = [0, 1]
+
+    def _test_linspace(self, device, dtype, steps):
+        a = torch.linspace(0, 10, steps=steps, dtype=dtype, device=device)
+        b = torch.linspace(0, 10, steps=steps)
         self.assertEqual(a, b)

-    @dtypes(torch.double)
-    def test_logspace(self, device, dtype):
-        a = torch.logspace(1, 10, 10, dtype=dtype, device=device)
-        b = torch.logspace(1, 10, 10, dtype=dtype, device='cpu')
+    # See NOTE [Linspace+Logspace precision override]
+    @precisionOverride({torch.half: 0.0039 + LINSPACE_LOGSPACE_EXTRA_EPS})
+    @dtypesIfCUDA(torch.half, torch.float, torch.double)
+    @dtypes(torch.float, torch.double)
+    def test_linspace(self, device, dtype):
+        self._test_linspace(device, dtype, steps=10)
+
+    @dtypesIfCUDA(torch.half, torch.float, torch.double)
+    @dtypes(torch.float, torch.double)
+    def test_linspace_special_steps(self, device, dtype):
+        for steps in self.LINSPACE_LOGSPACE_SPECIAL_STEPS:
+            self._test_linspace(device, dtype, steps=steps)
+
+    def _test_logspace(self, device, dtype, steps):
+        a = torch.logspace(1, 1.1, steps=steps, dtype=dtype, device=device)
+        b = torch.logspace(1, 1.1, steps=steps)
         self.assertEqual(a, b)

-        # Check non-default base=2
-        a = torch.logspace(1, 10, 10, 2, dtype=dtype, device=device)
-        b = torch.logspace(1, 10, 10, 2, dtype=dtype, device='cpu')
+    def _test_logspace_base2(self, device, dtype, steps):
+        a = torch.logspace(1, 1.1, steps=steps, base=2, dtype=dtype, device=device)
+        b = torch.logspace(1, 1.1, steps=steps, base=2)
         self.assertEqual(a, b)

+    # See NOTE [Linspace+Logspace precision override]
+    @precisionOverride({torch.half: 0.0157 + LINSPACE_LOGSPACE_EXTRA_EPS})
+    @dtypesIfCUDA(torch.half, torch.float, torch.double)
+    @dtypes(torch.float, torch.double)
+    def test_logspace(self, device, dtype):
+        self._test_logspace(device, dtype, steps=10)
+
+    # See NOTE [Linspace+Logspace precision override]
+    @precisionOverride({torch.half: 0.00201 + LINSPACE_LOGSPACE_EXTRA_EPS})
+    @dtypesIfCUDA(torch.half, torch.float, torch.double)
+    @dtypes(torch.float, torch.double)
+    def test_logspace_base2(self, device, dtype):
+        self._test_logspace_base2(device, dtype, steps=10)
+
+    @dtypesIfCUDA(torch.half, torch.float, torch.double)
+    @dtypes(torch.float, torch.double)
+    def test_logspace_special_steps(self, device, dtype):
+        for steps in self.LINSPACE_LOGSPACE_SPECIAL_STEPS:
+            self._test_logspace(device, dtype, steps=steps)
+            self._test_logspace_base2(device, dtype, steps=steps)
start, start + (end-start)/(steps-1), ..., start + (steps-2) * (end-start)/(steps-1), end

https://pytorch.org/docs/stable/generated/torch.logspace.html
65 PyTorch b9b9fd4fadc4d4fa0b030941a35011956eafa10b C++ Disable warning overflow Warning pragma enables selective modification of the behavior of compiler warning messages. The pragma warning( push ) stores the current warning state for every warning. The pragma warning( push, n ) stores the current state for every warning and sets the global warning level to n. The pragma warning( pop ) pops the last warning state pushed onto the stack. Any changes that you made to the warning state between push and pop are undone. False arithmetic overflow warning in MSVC ( Microsoft Visual compiler for C, C++) results in code not compiling compiler compiler overflow warning, Microsoft compiler disable test/warning disable warning Disable warnings for arithmetic overflow raised by MSVC (Microsoft Visual C Compiler). Add logic to ignore warning using warning pragma: pragma warning(disable : 4146) that allows for ignoring specified warning messages. Also push and pop are used. // Ignore the false warning "Arithmetic overflow" for MSVC
+ #ifdef _MSC_VER
+ # pragma warning(push)
+ # pragma warning(disable : 4146)
+ #endif
+
  /// Gets the minimum value for a N-bit signed integer.
  inline int64_t minIntN(int64_t N) {
    assert(N > 0 && N <= 64 && "integer width out of range");

+   return -(UINT64_C(1) << (N - 1));
  }

+ #ifdef _MSC_VER
+ # pragma warning(pop)
+ #endif
66 PyTorch ec8e75ea92ae2b5ea73b4aeb3ec7cb39e9f95db9 Cuda Fix overflow A histogram is a graphical representation of a data distribution, with many applications in computer vision, data analytics and medical imaging. An int does not have enough bits to represent the necessary values for nbins: the getBin function in CUDA overflows for large bVal and nbins values, so (bVal - minvalue) * nbins wraps to a negative bin offset
other other Cuda histogram increase variable precision/change variable type increase variable precision Patch: increase precision from int (32 bits) to int64   t = torch.zeros([10], dtype=torch.int32, device='cuda')
+        # 35488 * 65536 as int32 would cause overflow to negative value
+        # giving negative bin offset
+        t[0] = 35488
+        counted = t.bincount(minlength=65536)
+        self.assertEqual(torch.sum(counted), 10)
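A minimal arithmetic sketch of the overflow in the test above (the values come from the test; the two's-complement wrap-around is simulated explicitly because Python integers do not overflow):
# bin offset computed as (bVal - minvalue) * nbins in int32 arithmetic
offset = 35488 * 65536                        # 2325741568, exceeds 2**31 - 1 = 2147483647
wrapped = (offset + 2**31) % 2**32 - 2**31    # -1969225728: the negative bin offset seen in the bug
print(wrapped)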
67 PyTorch 17c1b2c7159a0218a69e8486eb4212339253353a Python Fix overflow Saturation arithmetic is a version of arithmetic in which all operations such as addition and multiplication are limited to a fixed range between a minimum and maximum value.
If the result of an operation is greater than the maximum, it is set ("clamped") to the maximum; if it is below the minimum, it is clamped to the minimum. The name comes from how the value becomes "saturated" once it reaches the extreme values; further additions to a maximum or subtractions from a minimum will not change the result.
In the quantization code, the range (i.e., the qmin and qmax values of the fixed range), which is used as a fallback for the default 8-bit qmin/qmax calculation when a dynamic range is not used, can cause overflow quantization quantization quantization, range rewrite math formula rewrite math formula Change the range: relax scale and zero-point for activations (reduce_range) to ensure that fbgemm implementations of conv and linear do not saturate due to 16-bit intermediate accumulation. But now in PyTorch: "Please use quant_min and quant_max to specify the range for observers. reduce_range will be deprecated in a future release of PyTorch."          if self.dtype == torch.qint8:
-            qmin, qmax = -128, 127
         else:
-            qmin, qmax = 0, 255
@@ -59,9 +61,15 @@ class ObserverBase(ABC, nn.Module):
         )

         if self.dtype == torch.qint8:
+            if self.reduce_range:
+                qmin, qmax = -64, 63
+            else:
+                qmin, qmax = -128, 127
         else:
+            if self.reduce_range:
+                qmin, qmax = 0, 127
+            else:
+                qmin, qmax = 0, 255
class ObserverTest(QuantizationTestCase):
     @given(qdtype=st.sampled_from((torch.qint8, torch.quint8)),
-           qscheme=st.sampled_from((torch.per_tensor_affine, torch.per_tensor_symmetric)))
-    def test_minmax_observer(self, qdtype, qscheme):
-        myobs = MinMaxObserver(dtype=qdtype, qscheme=qscheme)
+           qscheme=st.sampled_from((torch.per_tensor_affine, torch.per_tensor_symmetric)),
+           reduce_range=st.booleans())
+    def test_minmax_observer(self, qdtype, qscheme, reduce_range):
+        # reduce_range cannot be true for symmetric quantization with uint8
+        if qdtype == torch.quint8 and qscheme == torch.per_tensor_symmetric:
+            reduce_range = False
+        myobs = MinMaxObserver(dtype=qdtype, qscheme=qscheme, reduce_range=reduce_range)
68 PyTorch c845984271a551ac1c61b9eb06a17fb57aafbd7e Cuda Fix overflow A loop from i to n uses int to store the index i, which overflows after it is incremented. Overflow makes the index negative, which will also cause buffer overflow other other looping, Cuda increase variable precision/change variable type increase variable precision increase precision from int to int 64 #define CUDA_KERNEL_LOOP(i, n) \
for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n); i += blockDim.x * gridDim.x)
// CUDA: grid stride looping
+// int64_t _i_n_d_e_x specifically prevents overflow in the loop increment.
+// If input.numel() < INT_MAX, _i_n_d_e_x < INT_MAX, except after the final
+// iteration of the loop where _i_n_d_e_x += blockDim.x * gridDim.x can be
+// greater than INT_MAX.  But in that case _i_n_d_e_x >= n, so there are no
+// further iterations and the overflowed value in i=_i_n_d_e_x is not used.

#define CUDA_KERNEL_LOOP(i, n) \
+  int64_t _i_n_d_e_x = blockIdx.x * blockDim.x + threadIdx.x;                                \
+  for (int i=_i_n_d_e_x; _i_n_d_e_x < (n); _i_n_d_e_x+=blockDim.x * gridDim.x, i=_i_n_d_e_x)
@unittest.skipIf(not TEST_MEDIUM_TENSOR, "not enough memory")
+    def test_cuda_kernel_loop_overflow(self):
+        # Issue #24309: In extreme cases, the loop variable could overflow and continue
+        # the kernel loop with a negative index, causing a RuntimeError (invalid write):
+        x = torch.randn(1, 1, 1, 2**30 + 1, dtype=torch.float16, device="cuda")
+        expected = x[0, 0, 0, 2**30]
+        y = torch.nn.functional.avg_pool2d(x, kernel_size=1)
+        torch.cuda.synchronize()
+        self.assertEqual(y[0, 0, 0, 2**30], expected)
+
+    @unittest.skipIf(not TEST_LARGE_TENSOR, "not enough memory")
+    def test_cuda_kernel_loop_overflow_large(self):
+        # Make sure input.numel() > INT_MAX is handled:
+        x = torch.randn(1, 1, 1, 2**31, dtype=torch.float16, device="cuda")
+        with self.assertRaisesRegex(RuntimeError, "integer out of range"):
+            y = torch.nn.functional.avg_pool2d(x, kernel_size=1)
+
+        # Issue #24309: In extreme cases, the loop variable could overflow and continue
+        # the kernel loop with a negative index, causing a RuntimeError (invalid write):
+        x = torch.randn(1, 1, 1, 2**31 - 1, dtype=torch.float16, device="cuda")
+        expected = x[0, 0, 0, 2**31 - 2]
+        y = torch.nn.functional.avg_pool2d(x, kernel_size=1)
+        torch.cuda.synchronize()
+        self.assertEqual(y[0, 0, 0, 2**31 - 2], expected)
69 PyTorch 4d2bf0b51b71f96929b58c6e23fb71d3e25440ff Python Unit test loss of precision backward pass output in quantization aware training was not accurate enough quantization quantization quantization aware training, testing precision, backward pass increase variable precision/change variable type increase variable precision Increase precision from float to double
70 PyTorch af908d57ea07c593bb7c8db00c3139fc973b2d4c Python Unit test loss of precision The precision test for quantized operations is failing in function def test_adaptive_avg_pool2d(self, X, output_size_h, output_size_w) due to double rounding quantization quantization quantization, precision testing, average pooling relax accuracy test tolerance relax accuracy test tolerance Increase the unittest precision tolerance to 1.0 to avoid failing -            self.assertEqual(X_ref, qX_repr,
-                             message=error_message.format(name, X_ref, qX_repr))
self.assertEqual(X_ref, qX_hat.int_repr(), prec=1.0,
message=error_message.format(name, X_ref, qX_hat))
71 PyTorch 83bfd76b2f7a9b388537eb00022622d9c6989890 Python Unit test loss of precision absolute tolerance (atol). An absolute tolerance is a fixed number that is used to make direct comparisons Test in function make_input(batch_size) in class ONNX Runtime (ONNX=Open Neural Network Exchange) fails
AssertionError:  Not equal to tolerance rtol=0.001, atol=1e-07
other other GRU (Gated Recurrent Unit in RNN) relax accuracy test tolerance relax accuracy test tolerance relax precision tolerance, absolute tolerance (atol) = 1e-5 self.run_test(model, input, batch_size=RNN_BATCH_SIZE,) self.run_test(model, input, batch_size=RNN_BATCH_SIZE, atol=1e-5)
72 PyTorch 77651615c8976b6ad7ddd8abf2a62cd54b573f56 C++ Fix loss of precision CHAR_BIT indicates how many bits are in a char. On almost every architecture today it's 8 bits to a char, but on some historical machines it has been 7. The previous code used std::numeric_limits<T>::digits, which excludes the sign bit for signed types (7 for int8_t), whereas this gemm implementation expects the full bit width quantization quantization quantization, fbgemm increase variable precision/change variable type increase variable precision Use the correct number of bits of precision. Interestingly, while C expresses sizes in bytes, this library expects the precision in bits, so CHAR_BIT must be multiplied by the result of sizeof (sizeof returns the number of bytes) in order to get this number in bits. qparams.precision = std::numeric_limits<typename T::underlying>::digits; qparams.precision = CHAR_BIT * sizeof(typename T::underlying);
73 PyTorch 9b69f21a95fa626522ef371f8557e7286f9db318 C++ Fix loss of precision The Code Generator (codegen.h/cpp) produces the string to be compiled on the device.
Csrc directory in Pytorch repo contains all of the code concerned with integration with Python. This is in contrast to lib, which contains the Torch libraries that are Python agnostic. csrc depends on lib, but not vice versa.
Jit directory contains (most of) the C++ code for the PyTorch JIT, a language and compiler stack for executing PyTorch models portably and efficiently.
The fuser accepts subgraphs wrapped in "fusion nodes" and tries to execute them by just-in-time (JIT) compiling kernels that run all the graph operations.
Just-in-time (JIT) compilation (also dynamic translation or run-time compilation) is a way of executing computer code that involves compilation during execution of a program, at run time, rather than before execution.
Std::scientific
modifies the default formatting for floating-point input/output.
Specifically, write floating-point values in scientific notation
Sets the floatfield format flag for the str stream to scientific. When floatfield is set to scientific, floating-point values are written using scientific notation: the value is represented always with only one digit before the decimal point, followed by the decimal point and as many decimal digits as the precision field (precision). Finally, this notation always includes an exponential part consisting on the letter e followed by an optional sign and three exponential digits.
Std::setprecision
When used in an expression out << setprecision(n) or in >> setprecision(n), sets the precision parameter of the stream out or in to exactly n.
low precision emitted for prim::Constant compiler compiler code generation for compiler, fuser, JIT increase variable precision/change variable type increase variable precision Patch 1: Emit a higher-precision literal for floating-point values v in the fusion kernel using std::setprecision(16) instead of std::scientific.
Patch 2: increase precision in code that sets variable types: int to int64 and float to double
// Note: The NAN, NEG_INFINITY and POS_INFINITY strings map to device-specific
// implementations of these special values. These macros are found in the
// resource strings for each device.
static std::string scalarValue(const double v) {
  std::ostringstream out;
  if (std::isnan(v)) {
    out << "NAN";
  } else if (std::isinf(v)) {
    if (v < 0) {
      out << "NEG_INFINITY";
    } else {
      out << "POS_INFINITY";
    }
  } else {
    out << std::setprecision(16) << v;
  }
  return out.str();
}
@unittest.skipIf(RUN_CUDA, 'This tests the CPU fuser')
+    @unittest.skipIf(IS_WINDOWS or IS_SANDCASTLE, "NYI: fuser support for Windows or Sandcastle")
+    @enable_cpu_fuser
+    def test_fuser_double_literal_precision(self):
+        code = '''
+        graph(%2 : Float(*, *)):
+            %4 : int = prim::Constant[value=1]()
+            %3 : float = prim::Constant[value=1.282549830161864]()
+            %5 : Float(*, *) = aten::add(%2, %3, %4)
+            %1 : Float(*, *) = aten::relu(%5)
+            return (%1)
+        '''
+
+        graph = parse_ir(code)
+        code = torch._C._jit_fuser_get_fused_kernel_code(graph, [torch.rand(3, 4)])
+        FileCheck().check('1.282549830161864').run(code)
74 PyTorch 8e1e29124de99c01d08a2e2c02455c72335a971d Python Fix loss of precision In various distributions (Bernoulli, Binomial, etc.) the expand method chooses to use probabilities over logits, which results in loss of precision statistical distributions statistical distributions distributions rewrite math formula rewrite math formula In the method “expand(self, batch_shape, _instance=None)” of the distribution classes, change the preference between probabilities and logits:
if logits are available, use them over probabilities (not the other way around); see the sketch after this entry
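A minimal sketch (hypothetical extreme logit) of the precision thrown away when a distribution is re-parameterized through probabilities: an extreme logit maps to a probability that rounds to exactly 0.0 in float32, and the original logit can no longer be recovered.
import torch

logits = torch.tensor([-105.0])
probs = torch.sigmoid(logits)                 # tensor([0.]): exp(-105) underflows even float32 subnormals
recovered = torch.log(probs / (1 - probs))    # tensor([-inf]): the logit information is gone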
75 PyTorch 2ed95c58713b45a6a9dac4336135523555bc58a9 C++ Disable warning overflow Error from the Microsoft compiler when building compiler compiler Microsoft C++ compiler, Converter disable test/warning disable warning Disable the warning using pragma warning disable #ifdef _MSC_VER
+#pragma warning( disable : 4146 )
+#endif
76 PyTorch dc72a5e02c1ecb105ea58cafcf10ef3a6f7d9c25 C++ Fix underflow CV refers to OpenCV and rotatedRectangleIntersection is a function in OpenCV library
rotatedRectangleIntersection finds out if there is any intersection between two rotated rectangles.

int cv::rotatedRectangleIntersection ( const RotatedRect & rect1,
const RotatedRect & rect2,
OutputArray intersectingRegion
)
cv::rotatedRectangleIntersection has a known float underflow bug that would cause failure in ```CV_Assert(intersection.size() <= 8)```, Problem reported in OpenCV
data processing image processing OpenCV, rotated rectangle intersection use a different algorithm use a different algorithm Replace rotatedRectangleIntersection with a custom replacement function, cvfix_rotatedRectangleIntersection. When the OpenCV version is upgraded to >= 4.0, this replacement function can be removed.
77 PyTorch 4b97a4642100e26d14c34c07c31643422d60ac48 C++ Disable warning overflow compilation error due to signed overflow compiler compiler compiling disable test/warning disable warning Disable strict-overflow flag to avoid compilation error ADD_COMPILE_OPTIONS(-Wno-strict-overflow)
ADD_COMPILE_OPTIONS(-Wno-error=strict-overflow)
78 PyTorch 55b25365e9e11ee4d9dfb02ff1c79081225c7bd1 C++ New feature loss of precision N/A N/A non-standard precision non-standard precision quantization, low precision computations other add new algorithm Add feature to allow 8 bit precision values (ultra low precision) is_same<T, uint8_t>::value && GetCpuId().avx2(); is_same<T, uint8_t>::value && GetCpuId().avx2() &&
!FLAGS_caffe2_dnnlowp_force_slow_path;
79 PyTorch efd2aeac9e03a8813ba37db98e1a7645fa2902be txt Disable warning overflow The stringop-overflow warning uses Object Size Checking to determine the sizes of destination objects; the stringop-overflow flag is only available in GCC 7 and later compiler compiler GCC compiler flags disable test/warning disable warning Change the logic for the compiler flag Wno-stringop-overflow: set it only if the GCC compiler version is >= 7 if (CMAKE_COMPILER_IS_GNUCXX AND NOT (CMAKE_CXX_COMPILER_VERSION VERSION_LESS 7.0.0))
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-stringop-overflow")
+  endif()
80 PyTorch d97c9dd01904ff423554345cd877ebc1e520c21e Python Add warning loss of precision Check gradients computed via small finite differences against analytical
    gradients w.r.t. tensors in :attr:`inputs` that are of floating point or complex type
    and with ``requires_grad=True``.
    The check between numerical and analytical gradients uses :func:`~torch.allclose`.
    For most of the complex functions we consider for optimization purposes, no notion of
    Jacobian exists. Instead, gradcheck verifies if the numerical and analytical values of
    the Wirtinger and Conjugate Wirtinger derivatives are consistent. Because the gradient
    computation is done under the assumption that the overall function has a real-valued
    output, we treat functions with complex output in a special way. For these functions,
    gradcheck is applied to two real-valued functions corresponding to taking the real
    components of the complex outputs for the first, and taking the imaginary components
    of the complex outputs for the second. For more details, check out
    :ref:`complex_autograd-doc`.
Failure of the gradient check between numerical and analytical gradients due to low-precision input gradients/derivatives automatic differentiation gradients, autograd, testing precision add test/warning add precision warning Add a warning that inputs to gradcheck need to be of double precision for the comparison between numerical and analytical gradients to be reliable
81 PyTorch 4d287f90743e09d1fdc6e2b3519b16c2d1ae3fa3 C++ Fix overflow The for-loop index overflows if the input vector is large tensor math tensor math loop index, low level math, summation of scalars increase variable precision/change variable type increase variable precision increase precision from int to int64_t for (int i = k * WIDTH; i != size; i++) Patch: increase precision from int to int64_t
@@ -102,7 +102,7 @@ struct Reduction {
       sum = std::accumulate(buf, buf + WIDTH, scalar_t(ident), ReduceScalar());
     }
+    for (int64_t i = k * WIDTH; i != size; i++) {
       sum = ReduceScalar()(sum, data[i]);
     }
     return sum;
82 PyTorch 7cbe63da8621b6063c864527592db6b1c894804f Cuda Fix loss of precision statistical distributions statistical distributions Distributions (Multinomial), THT Tensor Random,
binarySearchForMultinomial
rewrite math formula rewrite math formula   // first non-zero element by setting start to size-1 here,
+    // the code below will move it to the last non-zero probability
+    // this actually can happen when the random number is 1
start = 0; start = size - 1; # Test a corner case from older PyTorch (Issue #4858)
+        freqs = torch.cuda.FloatTensor([
+            0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
+            0.03178183361887932, 0.027680952101945877, 0.033176131546497345,
+            0.046052902936935425, 0.07742464542388916, 0.11543981730937958,
+            0.14148041605949402, 0.15784293413162231, 0.13180233538150787,
+            0.08271478116512299, 0.049702685326337814, 0.027557924389839172,
+            0.018125897273421288, 0.011851548217236996, 0.010252203792333603,
+            0.007422595750540495, 0.005372154992073774, 0.0045109698548913,
+            0.0036087757907807827, 0.0035267581697553396, 0.0018864056328311563,
+            0.0024605290964245796, 0.0022964938543736935, 0.0018453967059031129,
+            0.0010662291897460818, 0.0009842115687206388, 0.00045109697384759784,
+            0.0007791675161570311, 0.00020504408166743815, 0.00020504408166743815,
+            0.00020504408166743815, 0.00012302644609007984, 0.0,
+            0.00012302644609007984, 4.100881778867915e-05, 0.0, 0.0, 0.0, 0.0,
+            0.0, 0.0])
+
+        torch.cuda.manual_seed(11042)
+        sample = torch.multinomial(freqs, 1000, True)
+        self.assertNotEqual(freqs[sample].min(), 0)
83 PyTorch 0167f76d2a99ced5f4359d8ea77eb6704179b797 Python Unit test loss of precision An absolute tolerance is a fixed number that is used to make direct comparisons.
Rtol = relative tolerance
precision tests/speed benchmarks accuracy testing testing precision, external to pytorch, onnx relax accuracy test tolerance relax accuracy test tolerance Relax precision, specifically, add absolute tolerance (atol). np.testing.assert_allclose(
                ref_outputs[i],
                outputs[i],
                rtol=1e-3)
np.testing.assert_allclose(
                ref_outputs[i],
                outputs[i],
                rtol=1e-3,
                atol=1e-7)
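For reference, np.testing.assert_allclose accepts a value when |actual - desired| <= atol + rtol * |desired|, so a purely relative tolerance can never be satisfied for reference values at or near zero. A minimal NumPy sketch with toy values (not taken from the PyTorch/ONNX test):
import numpy as np

ref = np.array([1.0, 1e-9])
out = ref + 5e-8                       # tiny absolute error on both entries

try:
    # rtol alone fails on the near-zero entry: 5e-8 > 1e-3 * 1e-9
    np.testing.assert_allclose(out, ref, rtol=1e-3)
except AssertionError:
    print("rtol-only check fails near zero")

# adding atol makes the check |out - ref| <= atol + rtol * |ref| pass
np.testing.assert_allclose(out, ref, rtol=1e-3, atol=1e-7)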
84 PyTorch 4b8f4fc25902e3a325b06e2db415bba9fad7c0ef Python New feature loss of precision N/A N/A non-standard precision non-standard precision mixed precision, training add new precision option add new precision option allow mixed precision in distributed training
85 PyTorch 873f1163806c14ae236538f76c44d04b63bef331 Python Unit test loss of precision The STFT computes the Fourier transform of short overlapping windows of the input. This gives frequency components of the signal as they change over time. The interface of this function is modeled after the librosa stft function. STFT (short-time Fourier transform) precision test not passing precision tests/speed benchmarks accuracy testing testing precision, fourier transform relax accuracy test tolerance relax accuracy test tolerance increase precision tolerance in assertEqual from 5e-6 to 7e-6 self.assertEqual(result.data, ref_result, 5e-6, 'stft result') self.assertEqual(result.data, ref_result, 7e-6, 'stft result')
86 PyTorch f9fd82d8933639a8cf20a029c7fa47fff8fdb93d Cuda Fix loss of precision __half2float = defined in cuda: cuda_fp16.h. Converts half number to float.
ScalarConvert = conversion helper defined in PyTorch's CUDA (THC) backend
non-standard precision non-standard precision mixed precision, sigmoid increase variable precision/change variable type increase variable precision Change how to convert half precision variables to float in struct TensorSigmoidOp. Use ScalarConvert instead of __half2float. Change float to accreal, which is int64_t -    float fin = __half2float(*in);
-    *out = __float2half(1.0f / (1.0f + expf(- fin)));

-#define H2F(input) __half2float(input)
-#define F2H(input) __float2half(input)

float fin = ScalarConvert<half, float>::to(*in);
+    *out = ScalarConvert<float, half>::to(1.0f / (1.0f + expf(- fin)));

+#define H2F(input) ScalarConvert<real, accreal>::to(input)
+#define F2H(input) ScalarConvert<accreal, real>::to(input)
87 PyTorch 35abc4efa2d08ef2e9b7d978089fbd98b8d14187 C++ New feature loss of precision N/A N/A torch.digamma(input, *, out=None) → Tensor Computes the logarithmic derivative of the gamma function on input.
torch.polygamma(n, input, *, out=None) → Tensor Computes the nth derivative of the digamma function on input. n≥0 is called the order of the polygamma function.
gradients/derivatives derivatives add new precision option add new precision option Add low-precision digamma() and polygamma() functions def test_digamma(self):
+        def test(use_double=False):
+            cpu_tensor = torch.randn(10, 10, 10)
+            gpu_tensor = cpu_tensor.cuda()
+            zeros = torch.zeros(10, 10, 10)
+            if (use_double):
+                cpu_tensor = cpu_tensor.double()
+                gpu_tensor = gpu_tensor.double()
+                zeros = zeros.double()
+            cpu_out = cpu_tensor.digamma()
+            gpu_out = gpu_tensor.digamma()
+            norm_errors = (gpu_out - cpu_out.cuda()) / gpu_out
+            self.assertEqual(norm_errors, zeros)
+
+        test(True)
+        test(False)
+
+    def test_polygamma(self):
+        def test(use_double=False):
+            cpu_tensor = torch.randn(10, 10, 10)
+            gpu_tensor = cpu_tensor.cuda()
+            zeros = torch.zeros(10, 10, 10)
+            if (use_double):
+                cpu_tensor = cpu_tensor.double()
+                gpu_tensor = gpu_tensor.double()
+                zeros = zeros.double()
+            for n in [0, 1]:
+                cpu_out = cpu_tensor.polygamma(n)
+                gpu_out = gpu_tensor.polygamma(n)
+                norm_errors = (gpu_out - cpu_out.cuda()) / gpu_out
+                self.assertEqual(norm_errors, zeros)
log, derivative
88 PyTorch 0443c11f7e4d14dfe5f5b23f4112a4c443d95a9c Python Fix loss of precision Volta is the codename for a GPU microarchitecture developed by Nvidia
The major revision number is 7 for devices based on the Volta architecture, 6 for devices based on the Pascal architecture, 5 for devices based on the Maxwell architecture, 3 for devices based on the Kepler architecture, 2 for devices based on the Fermi architecture, and 1 for devices based on the Tesla architecture.

torch.cuda.get_device_capability(device=None)
Gets the cuda capability of a device.
Return type: tuple(int, int)
Returns: the major and minor cuda capability of the device

pre-Volta architectures (i.e. compute capability major version below 7) need special handling for half precision non-standard precision non-standard precision GPU, cuDNN, RNN, half precision, hardware other check hardware Fix half precision for older (pre-Volta) NVIDIA cards. Add additional logic for checking the major compute capability of the currently selected device - only enable tensor-op math when the major capability is at least 7. -        if version() >= 7000 and int(cuda[0]) >= 9:
-            lib.cudnnSetRNNMatrixMathType(self, CUDNN_DEFAULT_MATH)
-            if datatype == CUDNN_DATA_HALF:
-                lib.cudnnSetRNNMatrixMathType(self, CUDNN_TENSOR_OP_MATH)
if version() >= 7000 and int(cuda[0]) >= 9 and (
+                    torch.cuda.get_device_capability(torch.cuda.current_device())[0] >= 7):
+                lib.cudnnSetRNNMatrixMathType(self, CUDNN_DEFAULT_MATH)
+                if datatype == CUDNN_DATA_HALF:
+                    lib.cudnnSetRNNMatrixMathType(self, CUDNN_TENSOR_OP_MATH)
89 PyTorch f7a459b28b184dedf265ed8718f85cb483e8284e Cuda Fix overflow MAGMA is a collection of next generation linear algebra (LA) GPU accelerated libraries designed and implemented by the team that developed LAPACK and ScaLAPACK. overflow when using magma
Ints are commonly only 32 bits. An int should not be used for a size. If you index an array of floats with an int, you can only store up to 8 GB of data in that array, much smaller than many workloads require.
other external library external library, linear algebra increase variable precision/change variable type increase variable precision Use a 64-bit value for the size. -  int n = a_->size[0];
-  int nrhs = b_->size[1];
int64_t n = a_->size[0];
+  int64_t nrhs = b_->size[1];
90 PyTorch 30bbeb8b87ef815d8a7ce8ab8af9a301fcedfbe9 C++ Change exception to a warning overflow/loss of precision exception due to a check requiring that type conversions are exact precision tests/speed benchmarks accuracy testing scalars, overflow check, precision check relax accuracy test tolerance relax accuracy test tolerance relax the type conversion check - only throw an exception on overflow
91 PyTorch a92fce18715be7317b5eab1319140899b355eb9f Python Unit test loss of precision cpu and gpu gradients are not exact precision tests/speed benchmarks accuracy testing precision testing, testing that cpu and cuda gradients are equal relax accuracy test tolerance relax accuracy test tolerance Relax precision tolerance from 0 to 5e-5 for comparing cpu and gpu gradients self.assertEqual(grid_cpu.grad, grid_cuda.grad) self.assertEqual(grid_cpu.grad, grid_cuda.grad, prec=5e-5)
92 PyTorch 4c35c630eca9a7a3fbfc8f4bc72ea2fd5ba0dd05 Python Unit test loss of precision precision tests/speed benchmarks accuracy testing gradient check relax accuracy test tolerance relax accuracy test tolerance Enable norm gradgradchecks by lowering precision requirements. Add absolute and relative tolerance precision based on empirical observations gradgradcheck_precision_override = {
+    'test_NormFunction_1_5': {'atol': 1e-2, 'rtol': 1e-2},
+    'test_NormFunction_2': {'atol': 1e-2, 'rtol': 1e-2},
+    'test_NormFunction_3': {'atol': 5e-2, 'rtol': 1e-2},
+}
+            if test_name in gradgradcheck_precision_override:
+                atol = gradgradcheck_precision_override[test_name]['atol']
+                rtol = gradgradcheck_precision_override[test_name]['rtol']
+                self.assertTrue(gradgradcheck(apply_fn, input, grad_y, atol=atol, rtol=rtol))
93 PyTorch eaacfc7e25c0500f933b0c68e63f1d947739df90 Python Speed optimization loss of precision Performance issue in the momentum update of SGD due to repeatedly converting the gradient from 16 bits to 32 and vice versa. CUDA was allocating and freeing chunks of memory frequently because grad was changing sizes optimizers optimizers SGD optimizer, momentum update increase variable precision/change variable type increase variable precision Keep grad in a preallocated fp32 buffer so CUDA no longer needs to allocate and free memory frequently.          net.MomentumSGDUpdate(
             [grad_fp32, momentum_data, lr, param_fp32],
-            [grad, momentum_data, param_fp32],
         net.MomentumSGDUpdate(
             [grad_fp32, momentum_data, lr, param_fp32],
param_fp32],
+            [grad_fp32, momentum_data, param_fp32],
94 PyTorch 1f4317be3f02d84e93303193e782c1cb002b26e3 Cuda, C++ New feature loss of precision N/A N/A Collectives = collective communication in distributed computing
Gloo backend for distributed CPU training
non-standard precision non-standard precision Distributed training, half precision add new precision option add new precision option Add support for collectives over vectors of half-precision floating point values
95 PyTorch aec182ae72d51dad0f46cdfe7ff9a41380d7da35 Cuda New feature loss of precision N/A N/A Performs a batch matrix-matrix product of matrices in batch1 and batch2. input is added to the final result.

batch1 and batch2 must be 3-D tensors each containing the same number of matrices.
linear algebra linear algebra tensor math, linear algebra add new precision option add new precision option add support for half precision in tensormath blas in BADDBMM (batch matrix-matrix product)
96 PyTorch 1bf7bc9768fa3f768419884595e08b3bc25913ea Cuda Fix overflow One often wants the type for the accumulator to be of higher precision than the inputs. When accumulating (summation for example), error will build up more. sum accumulator had insufficient precision statistical distributions statistical distributions distributions (multinomial) increase variable precision/change variable type increase variable precision change data type for accumulator from T to AccT and add assertion to make sure the sum of distribution did not overflow (i.e.: is not inf) assert(!isinf(sum));
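A minimal NumPy sketch (toy values, not the CUDA multinomial kernel) of why the accumulator type matters when many low-precision values are summed: the narrow accumulator stalls long before the true total, while a wider accumulator stays close to it.
import numpy as np

x = np.full(10000, 0.01, dtype=np.float16)   # true sum is 100

acc16 = np.float16(0)                        # accumulator as narrow as the data (the T role)
for v in x:
    acc16 += v                               # stalls near 32: 0.01 is below half the float16 spacing there

acc32 = np.float32(0)                        # wider accumulator (the AccT role)
for v in x:
    acc32 += np.float32(v)

print(acc16)   # ~32.0, far from the true sum
print(acc32)   # ~100.02, the exact sum of the rounded float16 inputs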
97 PyTorch c1ba0fbab3ad3f1a4b2630de9629c4749469eada C++ New feature loss of precision N/A N/A non-standard precision non-standard precision cuDNN, ReLu, mixed precision add new precision option add new precision option Decide at runtime which precision of types to use
98 PyTorch 26516f667e688ed38c8ded71af8e1abc3a56d5ee Python Unit test loss of precision tensor math statistics testing precision, mean, standard deviation relax accuracy test tolerance relax accuracy test tolerance relax precision tolerance in assertEqual -        self.assertEqual(r[:,:50].std(), 4, 0.2)
-        self.assertEqual(r[:,:50].std(), 4, 0.2)
-        self.assertEqual(q.mean(), 2, 0.1)
-        self.assertEqual(q.std(), 3, 0.1)
-        self.assertEqual(q.mean(), 0, 0.1)
-        self.assertEqual(q.std(), 1, 0.1)
  self.assertEqual(r[:,:50].std(), 4, 0.3)
+        self.assertEqual(r[:,:50].std(), 4, 0.3)
+        self.assertEqual(q.mean(), 2, 0.3)
+        self.assertEqual(q.std(), 3, 0.3)
+        self.assertEqual(q.mean(), 0, 0.2)
+        self.assertEqual(q.std(), 1, 0.2)
99 PyTorch cd780eb9ec20827a924c658b5960be452797076d C++ Speed optimization inefficient algorithm AXPBY scales two vectors, adds them to one another, and stores the result in the vector y.
In this case the type is double (daxpby)
?axpby performs the vector-vector operation defined as y := a*x + b*y, where a and b are scalars and x and y are vectors of length n
In caffe2 CPU math using MKL (an optimized Intel math library), the function CAFFE2_SPECIALIZED_AXPBY(double, d) suffers from underflow.
When running caffe2 experiments, calling Exp with many values close to 0 causes MKL's underflow error handler to be called repeatedly, causing significant overhead even though the result is correct (e.g. exp(x) = 0).
other external library external library (MKL), exp, caffe2, disable test/warning disable warning Disable MKL's underflow checker to speed up the operation by setting the error mode to VML_ERRMODE_IGNORE -#define DELEGATE_SIMPLE_UNARY_FUNCTION(T, Funcname, OriginalFunc)              \
-template <>                                                                    \
-void Funcname<T, CPUContext>(                                                  \
-    const int N, const T* x, T* y,                                             \
-    CPUContext* context) {                                                     \
-  OriginalFunc(N, x, y);                                                       \
-}
-DELEGATE_SIMPLE_UNARY_FUNCTION(float, Exp, vsExp)
-DELEGATE_SIMPLE_UNARY_FUNCTION(double, Exp, vdExp)
#define DELEGATE_SIMPLE_UNARY_FUNCTION(T, Funcname, OriginalFunc, ...) \
+  template <>                                                          \
+  void Funcname<T, CPUContext>(                                        \
+      const int N, const T* x, T* y, CPUContext* context) {            \
+    OriginalFunc(N, x, y, ##__VA_ARGS__);                              \
+  }
+DELEGATE_SIMPLE_UNARY_FUNCTION(
+    float,
+    Exp,
+    vmsExp,
+    VML_HA | VML_FTZDAZ_OFF | VML_ERRMODE_IGNORE)
+DELEGATE_SIMPLE_UNARY_FUNCTION(
+    double,
+    Exp,
+    vmdExp,
+    VML_HA | VML_FTZDAZ_OFF | VML_ERRMODE_IGNORE)
100 PyTorch 206029bc5a3f179abe97986641ed3ccd3c414126 C++ Fix overflow Integer literals are of type int; the size index variable overflows if the input tensor is very big (specifically when the input is > 2GB). other external library external library, caffe2 increase variable precision/change variable type increase variable precision Increase precision of the variable that holds the tensor size from int32 to int64. Instead of passing in an integer literal, static_cast the integer literal to the larger datatype so that it becomes the accumulator type      auto newSize = std::accumulate(
-        newDims.begin(), newDims.end(), 1, std::multiplies<TIndex>());

@@ -180,7 +183,10 @@ class Tensor {
   template <class T, class ContextForCopy>
   void Reserve(const std::vector<T>& newCapacity, ContextForCopy* context) {
     auto newSize = std::accumulate(
-        newCapacity.begin(), newCapacity.end(), 1, std::multiplies<TIndex>());
     if (newSize * meta_.itemsize() <= capacity_) {
       return;
     }
@@ -208,7 +214,10 @@ class Tensor {
         "New outer dimension must be smaller than current.");
     dims_[0] = outer_dim;
     size_ = std::accumulate(
-        dims_.begin(), dims_.end(), 1, std::multiplies<TIndex>());
   }
     auto newSize = std::accumulate(
+        newDims.begin(),
+        newDims.end(),
+        static_cast<TIndex>(1),
+        std::multiplies<TIndex>());
     if (newSize * meta_.itemsize() <= capacity_) {
       dims_ = newDims;
       size_ = newSize;
@@ -180,7 +183,10 @@ class Tensor {
   template <class T, class ContextForCopy>
   void Reserve(const std::vector<T>& newCapacity, ContextForCopy* context) {
     auto newSize = std::accumulate(
+        newCapacity.begin(),
+        newCapacity.end(),
+        static_cast<TIndex>(1),
+        std::multiplies<TIndex>());
     if (newSize * meta_.itemsize() <= capacity_) {
       return;
     }
@@ -208,7 +214,10 @@ class Tensor {
         "New outer dimension must be smaller than current.");
     dims_[0] = outer_dim;
     size_ = std::accumulate(
+        dims_.begin(),
+        dims_.end(),
+        static_cast<TIndex>(1),
+        std::multiplies<TIndex>());
   }
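A minimal NumPy analogue of the same pitfall (the dims and dtypes here are illustrative, not from caffe2): the type the product is accumulated in, not the element type, determines whether the size computation wraps.
import numpy as np

dims = [1024, 1024, 1024, 4]                 # a tensor with 2**32 elements

# accumulating the element count in a 32-bit type wraps around to 0,
# analogous to seeding std::accumulate with a plain int literal
print(np.prod(dims, dtype=np.int32))         # 0

# accumulating in a 64-bit type, like the static_cast<TIndex>(1) seed, is correct
print(np.prod(dims, dtype=np.int64))         # 4294967296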
101 PyTorch 5030d76acfcdd48492e988e3fc1aa19bebe9366a Python Fix loss of precision linear algebra linear algebra precision testing for CUDA blas relax accuracy test tolerance relax accuracy test tolerance reduce precision of CUDA blas tests custom_precision = {
     'addbmm': 1e-4,
     'addmm': 1e-4,
+    'addmv': 1e-4,
+    'addr': 1e-4,
+    'baddbmm': 1e-4,
     'rsqrt': 1e-4,
     'cumprod': 1e-4,
}
102 PyTorch a489884da4b63e33ede107261afd6a4a81d9401a Python Unit test loss of precision torch.addmm(input, mat1, mat2, *, beta=1, alpha=1, out=None) → Tensor Performs a matrix multiplication of the matrices mat1 and mat2. The matrix input is added to the final result. alpha and beta are scaling factors on the matrix-matrix product of mat1 and mat2 and on the added matrix input, respectively.
linear algebra linear algebra precision testing for matrix multiply relax accuracy test tolerance relax accuracy test tolerance Reduce precision of addmm CUDA test custom_precision = {
     'addbmm': 1e-4,
+    'addmm': 1e-4,
     'rsqrt': 1e-4,
     'cumprod': 1e-4,
}
out = Beta * input + Alpha * (mat1_i @ mat2_i)
103 PyTorch a0fb1ab86e88d5c98733d7e6e5aa3b5811fe24f4 Python Unit test loss of precision torch.rsqrt(input, *, out=None) → Tensor Returns a new tensor with the reciprocal of the square-root of each of the elements of input. linear algebra linear algebra precision testing for matrix multiply and square root relax accuracy test tolerance relax accuracy test tolerance Reduce precision for addmm and rsqrt CUDA tests out_i = 1/(sqrt(input_i))
104 PyTorch f7fe6cf1a6a58c55335e1b337dbdd23a78a2f74a C Fix overflow statistical distributions statistical distributions multinomial distribution increase variable precision/change variable type increase variable precision Using higher precision type for accumulator void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso
   for (i=0; i<n_dist; i++)
   {
     /* Get normalized cumulative distribution from prob distribution */
-    real sum = 0;
     for (j=0; j<n_categories; j++)
     {
       sum += THStorage_(get)( \
@@ -160,7 +160,7 @@ void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso
         /* update cumulative distribution so that sample cannot be drawn again */
         real diff;
         real new_val = 0;
-        real sum;
         if (sample_idx != 0)
         {
void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso
   for (i=0; i<n_dist; i++)
   {
     /* Get normalized cumulative distribution from prob distribution */
+    accreal sum = 0;
     for (j=0; j<n_categories; j++)
     {
       sum += THStorage_(get)( \
@@ -160,7 +160,7 @@ void THTensor_(multinomial)(THLongTensor *self, THGenerator *_generator, THTenso
         /* update cumulative distribution so that sample cannot be drawn again */
         real diff;
         real new_val = 0;
+        accreal sum;

         if (sample_idx != 0)
         {
105 Tensorflow/Keras 2ccbbdb4b06bf0d60d02c7cf316fce117b77df55 C++ fix overflow/underflow softmax output is NaN overflow/underflow Direct calculation of the softmax function according to its definition formula is fraught with numerical issues. The single-precision exp(x) function overflows
for x > 89 and underflows for x < −104, which, in turn, causes NaN outputs in naïve implementations.
activation functions activation functions softmax, openGL use a different algorithm use a different algorithm Implement a three-pass softmax algorithm, see the algorithm in https://arxiv.org/pdf/2001.04438.pdf softmax
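For illustration only, a NumPy sketch of the standard max-subtraction rewrite that keeps exp's argument non-positive; the committed OpenGL/OpenCL kernels follow the multi-pass algorithm of the paper cited above, not this exact code, and the function names here are illustrative.
import numpy as np

def naive_softmax(x):
    e = np.exp(x)                  # exp overflows to inf for large logits
    return e / e.sum()

def shifted_softmax(x):
    e = np.exp(x - x.max())        # largest exponent is 0, so no overflow
    return e / e.sum()

x = np.array([10.0, 1000.0, 990.0], dtype=np.float32)
print(naive_softmax(x))            # [0., nan, nan] -- inf/inf, plus an overflow warning
print(shifted_softmax(x))          # [~0., ~0.99995, ~4.5e-05]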
106 Tensorflow/Keras 115623e2fc21affeaeee5167daec9c1f0db27069 C++ fix overflow/underflow softmax output is NaN overflow/underflow Direct calculation of the softmax function according to its definition formula is fraught with numerical issues. The single-precision exp(x) function overflows
for x > 89 and underflows for x < −104, which, in turn, causes NaN outputs in naïve implementations.
activation functions activation functions softmax, openCL use a different algorithm use a different algorithm Implement a three-pass softmax algorithm, see the algorithm in https://arxiv.org/pdf/2001.04438.pdf softmax
107 Tensorflow/Keras e665a737f90564cd143fdc1b15420720596d17e1 C++ fix underflow tensor math statistics mean test rewrite math formula rewrite math formula auto input_rng = std::bind(
-      std::uniform_real_distribution<float>(-15.0f, 15.0f), std::ref(rng));
auto input_rng =
+      std::bind(std::uniform_real_distribution<float>(), std::ref(rng));
108 Tensorflow/Keras e60c1ba960e598be9c0e0cdd331cdc10e8919dbb C++ fix overflow/underflow activation functions activation functions LSTM, logistic function rewrite math formula rewrite math formula XlaOp Logistic(XlaOp x) {
-  auto half = xla::ScalarLike(x, 0.5);
-  return half + half * xla::Tanh(half * x);
}
XlaOp Logistic(XlaOp x) {
+  auto one = xla::ScalarLike(x, 1);
+  return xla::Div(one, (one + xla::Exp(xla::Neg(x))));
}
def testFloatOpsDisabledOnMlirBridge(self):
+    for dtype in self.float_types:
+      if dtype != np.float16:
+        self._assertOpOutputMatchesExpected(
+            lambda x: math_ops.sigmoid(x) / math_ops.log1p(math_ops.exp(x)),
+            np.array([-40, 40], dtype=dtype),
+            expected=np.array([1.0, 0.025], dtype=dtype))
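A small float32 sketch (function names are illustrative) of why the two logistic formulations behave differently in the tails: the half-tanh form flushes tiny outputs to exactly 0, while the exp form keeps them representable, which matters once the result feeds a log as in the added test.
import numpy as np

def sigmoid_tanh(x):               # the form removed by the patch
    return np.float32(0.5) + np.float32(0.5) * np.tanh(np.float32(0.5) * x)

def sigmoid_exp(x):                # the form the patch switches to
    return np.float32(1) / (np.float32(1) + np.exp(-x))

x = np.float32(-40.0)
print(sigmoid_tanh(x))             # 0.0 -- tanh(-20) already rounds to -1 in float32
print(sigmoid_exp(x))              # ~4.25e-18, still representable, so a later log() stays finite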
109 Tensorflow/Keras 86fa42f516e4c5ca5ac3b2430aeab9d1a55afb30 python fix loss of precision the output of the derivative of betainc is NaN invalid operation I = betainc(X,Z,W) computes the incomplete beta function for corresponding elements of the arrays X, Z and W. The elements of X must be in the closed interval [0, 1]. The arrays Z and W must be nonnegative and real. All arrays must be the same size, or any of them can be scalar. When calculating the derivative of betainc, if a or b is equal to 1, there is a risk that log(0) occurs gradients/derivatives derivatives derivative of Betainc (incomplete beta function) rewrite math formula rewrite math formula Use xlog1py and xlogy instead of log. The function xlog1py computes x * log1p(y) for given x and y; it safely returns zero when x = 0, no matter what the value of y is. The function xlogy(x, y) returns 0 if x == 0, and x * log(y) otherwise, elementwise. partial_x = math_ops.exp((b - 1) * math_ops.log(1 - x) +
-                           (a - 1) * math_ops.log(x) - log_beta)
# We use xlog1py and xlogy since the derivatives should tend to
+  # zero one one of the tails when a is 1. or b is 1.
+  partial_x = math_ops.exp(math_ops.xlog1py(b - 1, -x) +
+                           math_ops.xlogy(a - 1, x) - log_beta)
exp, log
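scipy.special provides xlogy and xlog1py with the same zero-at-zero convention as the math_ops versions used in the patch (assuming SciPy is available); a minimal sketch of the endpoint the gradient hits:
import numpy as np
from scipy.special import xlog1py, xlogy

a, b, x = 1.0, 3.0, 0.0            # endpoint where the naive gradient evaluates log(0)

naive = (b - 1) * np.log(1 - x) + (a - 1) * np.log(x)   # 0 * (-inf) -> nan
safe = xlog1py(b - 1, -x) + xlogy(a - 1, x)              # 0 by convention when the factor is 0

print(naive, safe)                 # nan 0.0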
110 Tensorflow/Keras ee85e6d230278e763a2784ba86acc747abdb2242 C++ fix loss of precision decreased accuracy tensor math statistics variance use a different algorithm use a different algorithm Use the more numerically stable two-pass algorithm to calculate the variance in MeanStddevNormalization.    for (int batch = 0; batch < n_batch; ++batch) {
     float sum = 0.0f;
-    float sum_sq = 0.0f;
     for (int i = 0; i < v_size; ++i) {
       sum += input_vector[i];
-      sum_sq += input_vector[i] * input_vector[i];
     }
     const float mean = sum / v_size;
-    const float variance = sum_sq / v_size - mean * mean;
   for (int batch = 0; batch < n_batch; ++batch) {
     float sum = 0.0f;
     }
     const float mean = sum / v_size;
-    const float variance = sum_sq / v_size - mean * mean;
+    float sum_diff_sq = 0.0f;
+    for (int i = 0; i < v_size; ++i) {
+      const float diff = input_vector[i] - mean;
+      sum_diff_sq += diff * diff;
+    }
+    const float variance = sum_diff_sq / v_size;
variance, sum of squares
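A minimal float32 sketch (toy data, illustrative function names) contrasting the single-pass E[x^2] - E[x]^2 formula removed here with the two-pass formula the patch introduces:
import numpy as np

def naive_variance(v):             # single pass: E[x^2] - E[x]^2
    s = np.float32(0); s_sq = np.float32(0)
    for x in v:
        s += x; s_sq += x * x      # x*x keeps only ~7 significant digits in float32
    mean = s / len(v)
    return s_sq / len(v) - mean * mean

def two_pass_variance(v):
    mean = v.mean()                # first pass: the mean
    d = np.float32(0)
    for x in v:
        d += (x - mean) * (x - mean)   # second pass: squared deviations
    return d / len(v)

v = np.array([10000.0, 10000.1, 10000.2], dtype=np.float32)
print(naive_variance(v))           # about -16: cancellation wipes out the answer
print(two_pass_variance(v))        # ~0.0067, the correct variance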
111 Tensorflow/Keras fd2d8bc50e9b3143544819bf505326e4ed6db2a5 C++ fix overflow/underflow incorrect result overflow XlaOp = a handle to an operation in an XLA computation. asinh(x) = log(x + sqrt(x^2 + 1)) carries a risk of overflow due to x^2 for large x tensor math tensor math inverse hyperbolic sine rewrite math formula rewrite math formula For positive x, we can approximate x + sqrt(x^2 + 1) as 2*x and return log(2) + log(x). For negative x we utilize asinh(-x) = -asinh(x) XlaOp Asinh(XlaOp x) { return Log(x + Sqrt(x * x + ScalarLike(x, 1.0))); } XlaOp Asinh(XlaOp x) {
+  XlaBuilder* b = x.builder();
+  auto do_it = [&](XlaOp x) -> StatusOr<XlaOp> {
+    TF_ASSIGN_OR_RETURN(auto shape, b->GetShape(x));
+    auto one = ScalarLike(x, 1);
+    if (primitive_util::IsComplexType(shape.element_type())) {
+      return Log(x + Sqrt(x * x + one));
+    }
+    auto a = Abs(x);
+    auto naive_result = Log(a + Sqrt(a * a + one));
+    auto overflow_result = Log(Abs(a)) + Log(ScalarLike(a, 2));
+    auto sqrt_max_value = Sqrt(MaxFiniteValue(b, shape.element_type()));
+    return Sign(x) *
+           Select(Ge(a, sqrt_max_value), overflow_result, naive_result);
+  };
+  // These upcasts are not strictly necessary on all platforms to get within our
+  // error tolerances, so we could relax this if it ever mattered.
+  return DoWithUpcastToF32(x, {BF16, F16}, [&](XlaOp x) {
+    return b->ReportErrorOrReturn(do_it(x));
+  });
+}
log, square root, power
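A NumPy sketch of the same idea (illustrative function names, plain float64 instead of XLA types): switch to log|x| + log 2 once x*x would overflow.
import numpy as np

def naive_asinh(x):
    return np.log(x + np.sqrt(x * x + 1.0))        # x*x overflows for large |x|

def safer_asinh(x):
    a = abs(x)
    if a > np.sqrt(np.finfo(np.float64).max):       # here x + sqrt(x^2 + 1) ~ 2x
        return np.sign(x) * (np.log(a) + np.log(2.0))
    return np.sign(x) * np.log(a + np.sqrt(a * a + 1.0))

x = 1e200
print(naive_asinh(x))                  # inf, after an overflow warning
print(safer_asinh(x), np.arcsinh(x))   # both ~461.21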
112 Tensorflow/Keras f84e8257aa88fa45cc7a15835ad386565cd60237 C++ fix loss of precision In Eigen, a reduction is a function taking a matrix or array, and returning a single scalar value. One of the most used reductions is .sum() , returning the sum of all the coefficients inside a given matrix or array. CNN operations pooling layer eigen reduction, summation, EigenPooling use a different algorithm use a different algorithm use a tree algorithm for summation summation
113 Tensorflow/Keras 18f860fd8e1fdffd80633cf5ac32f895423dfa8d C++ fix underflow/loss of precision other random number generator testing, random number generation rewrite math formula rewrite math formula change input range for random number generator std::uniform_real_distribution<FloatT> generator(-0.9f, 1.0f); std::uniform_real_distribution<FloatT> generator(1.0f, 1.125f);
114 Tensorflow/Keras 35ca57d39b9e368ef43302421db774e4ac3e3625 Python fix overflow/underflow overflow/underflow statistical distributions statistical distributions negative binomial distribution rewrite math formula rewrite math formula Use log_sigmoid instead of log1p and log. Also, use logits instead of probabilities return (self.total_count * math_ops.log1p(-self.probs)
-            + x * math_ops.log(self.probs))
return (self.total_count * math_ops.log_sigmoid(-self.logits)
+            + x * math_ops.log_sigmoid(self.logits))
def testLogProbOverflow(self):
+    with self.test_session() as sess:
+      logits = np.float32([20., 30., 40.])
+      total_count = np.float32(1.)
+      x = np.float32(0.)
+      nb = negative_binomial.NegativeBinomial(
+          total_count=total_count, logits=logits)
+      log_prob_ = sess.run(nb.log_prob(x))
+      self.assertAllEqual(np.ones_like(log_prob_, dtype=np.bool),
+                          np.isfinite(log_prob_))
+
+  def testLogProbUnderflow(self):
+    with self.test_session() as sess:
+      logits = np.float32([-90, -100, -110])
+      total_count = np.float32(1.)
+      x = np.float32(0.)
+      nb = negative_binomial.NegativeBinomial(
+          total_count=total_count, logits=logits)
+      log_prob_ = sess.run(nb.log_prob(x))
+      self.assertAllEqual(np.ones_like(log_prob_, dtype=np.bool),
+                          np.isfinite(log_prob_))
log
115 Tensorflow/Keras 2114fd51e9e4fe3cefc058fe42363f68126a9da6 C++ fix overflow/underflow overflow/underflow softplus(x) = log(exp(x) + 1), softplus is a smooth approximation of relu. Like relu, softplus always takes on positive values.

activation functions activation functions softplus rewrite math formula rewrite math formula XLAJIT_MAKE_UNARY(Softplus,
-                  b->Log(b->Add(b->Exp(x), XlaHelpers::One(b, input_type(0)))));
static xla::ComputationDataHandle Softplus(
+    xla::ComputationBuilder* b, DataType dtype,
+    const xla::ComputationDataHandle& features) {
+  xla::ComputationDataHandle threshold =
+      b->Add(b->Log(XlaHelpers::Epsilon(b, dtype)),
+             XlaHelpers::FloatLiteral(b, dtype, 2.0));
+  // Value above which exp(x) may overflow, but softplus(x) == x
+  // is within machine epsilon.
+  xla::ComputationDataHandle too_large = b->Gt(features, b->Neg(threshold));
+  // Value below which exp(x) may underflow, but softplus(x) == exp(x)
+  // is within machine epsilon.
+  xla::ComputationDataHandle too_small = b->Lt(features, threshold);
+  xla::ComputationDataHandle features_exp = b->Exp(features);
+  xla::ComputationDataHandle output = b->Select(
+      too_large, features,
+      b->Select(too_small, features_exp,
+                b->Log(b->Add(features_exp, XlaHelpers::One(b, dtype)))));
+  return output;
+}
+XLAJIT_MAKE_UNARY(Softplus, Softplus(b, input_type(0), x));
def _assertSoftplusMatchesExpected(self, features, dtype):
+    features = np.array(features, dtype=dtype)
+    zero = np.asarray(0).astype(dtype)
+    expected = np.logaddexp(zero, features)
+    self._assertOpOutputMatchesExpected(
+        nn_ops.softplus, features, expected=expected)
+
+  def testSoftplus(self):
+    for dtype in self.float_types:
+      self._assertSoftplusMatchesExpected([[-2, 0, 8]], dtype)
+      self._assertSoftplusMatchesExpected(
+          [[-9, 7, -5, 3, -1], [1, -3, 5, -7, 9]], dtype)
+      log_eps = np.log(np.finfo(dtype).eps)
+      one = dtype(1)
+      ten = dtype(10)
+      self._assertSoftplusMatchesExpected([
+          log_eps, log_eps - one, log_eps + one, log_eps - ten,
+          log_eps + ten, -log_eps, -log_eps - one, -log_eps + one,
+          -log_eps - ten, -log_eps + ten], dtype)
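A small NumPy illustration of both failure modes of the naive softplus; np.logaddexp(0, x) is the reference the added test compares against, and naive_softplus is an illustrative name.
import numpy as np

def naive_softplus(x):
    return np.log(np.exp(x) + 1.0)     # overflows for large x; returns exactly 0 for very negative x

x = np.array([-40.0, 0.0, 100.0], dtype=np.float32)
print(naive_softplus(x))               # [0.       0.6931  inf]
print(np.logaddexp(0.0, x))            # [4.25e-18 0.6931  100.] -- what the added test expects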
116 Tensorflow/Keras 448de13b1ae2ebc96a49785cee5ae98db1ae7b06 C++ fix overflow/underflow linear algebra determinant log determinant of a matrix use a different algorithm use a different algorithm Compute the log determinant through a Partially Pivoted LU decomposition
117 Tensorflow/Keras 1193b39c9e58545ac35aae19dfa34a06bdfae073 Python fix underflow Poisson is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time or space if these events occur with a known constant mean rate and independently of the time since the last event. Lambda can be 0 through infinity. For a small rate lambda in the Poisson distribution, e^(-lambda) causes numerical stability issues, because exp of a very small (very negative) argument produces an exponentially smaller number, which leads to a risk of underflow statistical distributions statistical distributions poisson distribution rewrite math formula rewrite math formula Use the log of the rate instead of the plain rate to avoid exponentiating very small numbers with ops.control_dependencies([check_ops.assert_positive(rate)] if
-                                    validate_args else []):
-        self._rate = array_ops.identity(rate, name="rate")
if (rate is None) == (log_rate is None):
+        raise ValueError("Must specify exactly one of `rate` and `log_rate`.")
+      elif log_rate is None:
+        rate = ops.convert_to_tensor(rate, name="rate")
+        if not rate.dtype.is_floating:
+          raise TypeError("rate.dtype ({}) is a not a float-type.".format(
+              rate.dtype.name))
+        with ops.control_dependencies([check_ops.assert_positive(rate)] if
+                                      validate_args else []):
+          self._rate = array_ops.identity(rate, name="rate")
+          self._log_rate = math_ops.log(rate, name="log_rate")
+      else:
+        log_rate = ops.convert_to_tensor(log_rate, name="log_rate")
+        if not log_rate.dtype.is_floating:
+          raise TypeError("log_rate.dtype ({}) is a not a float-type.".format(
+              log_rate.dtype.name))
+        self._rate = math_ops.exp(log_rate, name="rate")
+        self._log_rate = ops.convert_to_tensor(log_rate, name="log_rate")

class PoissonLogRateTest(PoissonTest):
+
+  def _make_poisson(self, rate, validate_args=False):
+    return poisson_lib.Poisson(
+        log_rate=math_ops.log(rate),
+        validate_args=validate_args)
+
+  def testInvalidLam(self):
+    # No need to worry about the non-negativity of `rate` when using the
+    # `log_rate` parameterization.
+    pass
exp
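A minimal sketch (using math.lgamma; the helper names are illustrative and this is not the TF Poisson class) of why keeping log_rate avoids the log(0) that appears once exp(log_rate) underflows:
import numpy as np
from math import lgamma

def log_prob_from_rate(x, rate):           # Poisson log pmf via the rate
    return x * np.log(rate) - rate - lgamma(x + 1.0)

def log_prob_from_log_rate(x, log_rate):   # same quantity via the log rate
    return x * log_rate - np.exp(log_rate) - lgamma(x + 1.0)

log_rate = np.float32(-800.0)
rate = np.exp(log_rate)                    # underflows to 0.0 in float32
print(log_prob_from_rate(np.float32(1.0), rate))          # -inf: log(0)
print(log_prob_from_log_rate(np.float32(1.0), log_rate))  # -800.0, finite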
118 Tensorflow/Keras 0cff60ebb29f5aba5092988c8b7f13c258115e81 Python fix overflow/underflow linear algebra linear algebra log of hermitian matrix determinant use a different algorithm use a different algorithm Use the property that log det(A) = 2*sum(log(real(diag(C)))), where C is the Cholesky decomposition of A. Add a function to compute the natural log of the determinant for hermitian positive definite matrices in a numerically stable way via Cholesky decompositions. Equivalent to numpy.linalg.slogdet, although no sign is returned since only
+  hermitian positive definite matrices are supported.
def logdet(matrix, name=None):
+  """Computes log of the determinant of a hermitian positive definite matrix.
+
+  ```python
+  # Compute the determinant of a matrix while reducing the chance of over- or
+  underflow:
+  A = ... # shape 10 x 10
+  det = tf.exp(tf.logdet(A))  # scalar
+  ```
+
+  Args:
+    matrix:  A `Tensor`. Must be `float32`, `float64`, `complex64`, or
+      `complex128` with shape `[..., M, M]`.
+    name:  A name to give this `Op`.  Defaults to `logdet`.
+
+  Returns:
+    The natural log of the determinant of `matrix`.
+
+  @compatibility(numpy)
+  Equivalent to numpy.linalg.slogdet, although no sign is returned since only
+  hermitian positive definite matrices are supported.
+  @end_compatibility
+  """
+  # This uses the property that the log det(A) = 2*sum(log(real(diag(C))))
+  # where C is the cholesky decomposition of A.
+  with ops.name_scope(name, 'logdet', [matrix]):
+    chol = gen_linalg_ops.cholesky(matrix)
+    return 2.0 * math_ops.reduce_sum(
+        math_ops.log(math_ops.real(array_ops.matrix_diag_part(chol))),
+        reduction_indices=[-1])
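A NumPy sketch of the identity used in the patch, log det(A) = 2*sum(log(diag(chol(A)))), on an assumed toy matrix whose determinant overflows float64 while its log-determinant is perfectly representable:
import numpy as np

n = 500
A = np.diag(np.full(n, 50.0))              # well-conditioned positive definite matrix, det = 50**500

print(np.linalg.det(A))                    # inf: the product of pivots overflows float64

chol = np.linalg.cholesky(A)
logdet_chol = 2.0 * np.sum(np.log(np.diag(chol)))   # the identity used in the patch
sign, logdet_ref = np.linalg.slogdet(A)

print(logdet_chol, logdet_ref)             # both ~1956.01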
119 Tensorflow/Keras b85601b95eba28605d3de076fa70cabf2f2e32b9 Python fix loss of precision incorrect result In probability theory, an ƒ-divergence is a function D_f(P || Q) that measures the difference between two probability distributions P and Q. If the probability distribution Q is not reparameterized, TensorFlow's gradient will be incorrect since the chain rule stops at samples of unreparameterized distributions other probability ƒ-divergence use a different algorithm use a different algorithm Improve the score-trick to be a valid Csiszar f-Divergence yet numerically stable. Using the Score-Gradient trick results in an unbiased gradient nabla E_q[f(X)]
-  = nabla int dx q(x) f(x)
-  = int dx nabla [ q(x) f(x) ]
-  = int dx q'(x) f(x) + q(x) f'(x)
-  = int dx q(x) nabla [ log(q(x)) stopgrad[f(x)] + f(x) ]
-  = E_q[ nabla [ log(q(X)) stopgrad[f(X)] + f(X) ] ]
-  ~= Avg{ log(q(y_i)) stopgrad[f(y_i)] + f(y_i) : y_i = stopgrad[x_i], x_i ~ q}
grad[ E_q[f(X)] ]
+  = grad[ int dx q(x) f(x) ]
+  = int dx grad[ q(x) f(x) ]
+  = int dx [ q'(x) f(x) + q(x) f'(x) ]
+  = int dx q(x) grad[ f(x) q(x) / stop_grad[q(x)] ]
+  = E_q[ grad[ f(x) q(x) / stop_grad[q(x)] ] ]
120 Tensorflow/Keras e6126230200e2ce9c96da5c9e4dc7f104c645d11 Python fix overflow/underflow overflow/underflow for very small or very large numbers naive direct computation of the log of a sum of exponentials has a risk of underflow and overflow respectively other other Gaussian mixture model, log probability rewrite math formula rewrite math formula Use TensorFlow's reduce_logsumexp function to work in log scale, which is more numerically stable than computing exp -> sum -> log directly, for calculating the log probability self._prior_probs[shard_id] = math_ops.log(
-        math_ops.reduce_sum(
-            math_ops.exp(self._probs[shard_id]), 1, keep_dims=True))
self._prior_probs[shard_id] = math_ops.reduce_logsumexp(
+        self._probs[shard_id], axis=1, keep_dims=True)
def test_random_input_large(self):
+    # sklearn version.
+    iterations = 5  # that should be enough to know whether this diverges
+    np.random.seed(5)
+    num_classes = 20
+    x = np.array([[np.random.random() for _ in range(100)]
+                  for _ in range(num_classes)], dtype=np.float32)
+
+    # skflow version.
+    gmm = gmm_lib.GMM(num_classes,
+                      covariance_type='full',
+                      config=run_config.RunConfig(tf_random_seed=2))
+
+    def get_input_fn(x):
+      def input_fn():
+        return constant_op.constant(x.astype(np.float32)), None
+      return input_fn
+
+    gmm.fit(input_fn=get_input_fn(x), steps=iterations)
+    self.assertFalse(np.isnan(gmm.clusters()).any())
log sum of exp
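A minimal NumPy sketch of the max-shift trick that reduce_logsumexp-style functions rely on (illustrative function names, toy inputs):
import numpy as np

def naive_log_sum_exp(x):
    return np.log(np.sum(np.exp(x)))           # exp overflows / underflows first

def shifted_log_sum_exp(x):
    m = np.max(x)
    return m + np.log(np.sum(np.exp(x - m)))   # exponents are all <= 0

big = np.array([1000.0, 1001.0])
small = np.array([-1000.0, -1001.0])
print(naive_log_sum_exp(big), naive_log_sum_exp(small))      # inf -inf
print(shifted_log_sum_exp(big), shifted_log_sum_exp(small))  # ~1001.31 ~-999.69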
121 Tensorflow/Keras fdbd02c8d7f07bd1207938662716fad8857dcd55 Python fix loss of precision deals with the shift parameter, but this feature is not available in TF now tensor math statistics mean, variance rewrite math formula rewrite math formula change the shift value for calculating mean shift = math_ops.cast(shift, dtypes.float32) if (
-        shift is not None and x.dtype == dtypes.float16) else shift
if shift is None:
+      # Compute true mean while keeping the dims for proper broadcasting.
+      shift = array_ops.stop_gradient(
+          math_ops.reduce_mean(y, axes, keep_dims=True))
+    else:
+      shift = math_ops.cast(shift, y.dtype)
     
+    # Reshape shift as needed.
+    shift = array_ops.reshape(shift, array_ops.shape(m_ss))
+    shift.set_shape(m_ss.get_shape())
122 Tensorflow/Keras 7c97f13ace37ac73bb820dec941c55ae4d538581 Python fix underflow Student's t-distribution is defined as the distribution of the random variable t which is (very loosely) the "best" that we can do not knowing sigma. statistical distributions statistical distributions student t distribution log probability rewrite math formula rewrite math formula use log1p instead of log. The function log1p computes the natural logarithm of (1 + x) element-wise. def _log_prob(self, x):
    y = (x - self.mu) / self.sigma
    half_df = 0.5 * self.df
    return (math_ops.lgamma(0.5 + half_df) - math_ops.lgamma(half_df) - 0.5 *
            math_ops.log(self.df) - 0.5 * math.log(math.pi) -
            math_ops.log(self.sigma) -
            (0.5 + half_df) * math_ops.log(1. + math_ops.square(y) / self.df))
def _log_prob(self, x):
    return self._log_unnormalized_prob(x) - self._log_normalization()

  def _log_unnormalized_prob(self, x):
    y = (x - self.mu) / self.sigma  # Abs(sigma) superfluous.
    return -0.5 * (self.df + 1.) * math_ops.log1p(y**2. / self.df)

  def _log_normalization(self):
    return (math_ops.log(math_ops.abs(self.sigma)) +
            0.5 * math_ops.log(self.df) +
            0.5 * np.log(np.pi) +
            math_ops.lgamma(0.5 * self.df) -
            math_ops.lgamma(0.5 * (self.df + 1.)))
log
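A one-line illustration of why log1p is preferred for small arguments: forming 1 + x first already discards most of x's digits.
import numpy as np

x = 1e-9
print(np.log(1.0 + x))   # ~1.00000008e-09: the 1 + x addition already lost ~8 digits of x
print(np.log1p(x))       # ~1.0e-09, accurate to full double precision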
123 Tensorflow/Keras de6ce1de08ea97d599687fbbe5196ca4af5232ae C++ fix overflow large logit values were not properly handled in multinomial distribution statistical distributions statistical distributions Multinomial distribution rewrite math formula rewrite math formula subtract a maximum from logits before taking exponentials running_total += std::exp(static_cast<float>(logits_row[j])) // Takes an along-class maximum (for numerical stability).
+        T max = std::numeric_limits<T>::lowest();
+        for (int64 j = 0; j < num_classes; ++j) {
+          if (std::isfinite(static_cast<float>(logits_row[j]))) {
+            max = std::max(max, logits_row[j]);
+          }
+        }
+        const float max_logit = static_cast<float>(max);
running_total += std::exp(static_cast<float>(logits_row[j]) - max_logit);
def testLargeLogits(self):
+    for neg in [True, False]:
+      with self.test_session(use_gpu=self.use_gpu):
+        logits = np.array([[1000.] * 5])
+        if neg:
+          logits *= -1
+        samples = tf.multinomial(logits, 10).eval()
+      # Sampled classes should be in-range.
+      self.assertTrue((samples >= 0).all())
+      self.assertTrue((samples < 5).all())
exponential
124 Tensorflow/Keras e47dc8593d11be8cd82767965b8b75b6307c07e4 Python fix loss of precision     There is evidence that the 'shift' strategy in computing the sufficient statistics of the moments is actually leading to worse numerical stability for batch normalization.
tensor math statistics mean, variance other amend algorithm change the default of the shift parameter from True to False in the sufficient_statistics method, which calculates the mean and variance used by batch normalization def sufficient_statistics(x, axes, shift=True, keep_dims=False, name=None) def sufficient_statistics(x, axes, shift=False, keep_dims=False, name=None) variance, mean
125 Tensorflow/Keras ab1165c4908b70441f1ddea24821a8b84a806ddc C++ fix overflow/underflow Legalization is the phase in code generation that eradicates any instructions that are not supported by the target. Multi-Level IR Compiler Framework activation functions activation functions sigmoid, compiler other amend algorithm This function converts Sigmoid op to HLO ops computing sigmoid class ConvertSigmoidOp : public OpRewritePattern<TF::SigmoidOp> {

-  using OpRewritePattern::OpRewritePattern;
-
-  LogicalResult matchAndRewrite(TF::SigmoidOp op,
class ConvertSigmoidOp : public RewritePattern {
  public:
+  explicit ConvertSigmoidOp(MLIRContext *context)
+      : RewritePattern(
+            TF::SigmoidOp::getOperationName(), 0, context,
+            {mhlo::ConstOp::getOperationName(),
+             shape::ShapeOfOp::getOperationName(),
+             shape::ToExtentTensorOp::getOperationName(),
+             mhlo::DynamicBroadcastInDimOp::getOperationName(),
+             mhlo::MulOp::getOperationName(), mhlo::TanhOp::getOperationName(),
+             mhlo::AddOp::getOperationName()}) {}
+
+  LogicalResult matchAndRewrite(Operation *sigmoid_op,
                                 PatternRewriter &rewriter) const override {
+    auto op = cast<TF::SigmoidOp>(sigmoid_op);
126 Tensorflow/Keras 6acd86d539464b611d37b8dc13251fafab25fb5c C++ fix loss of precision tensor math tensor math argmin rewrite math formula rewrite math formula amend logic for tie breaking
127 Tensorflow/Keras f73e9d61a7c577a5182701d3aa5bba8d6d69f87d C++ fix loss of precision tensor math tensor math argmin, argmax rewrite math formula rewrite math formula amend logic for tie breaking
128 Tensorflow/Keras ee85e6d230278e763a2784ba86acc747abdb2242 C++ fix loss of precision MeanStddevNormalization is numerically unstable tensor math statistics variance use a different algorithm use a different algorithm Use the numerically stable two-pass algorithm to calculate variance in MeanStddevNormalization. float sum_sq = 0.0f;
sum_sq += input_vector[i] * input_vector[i];
     }
const float variance = sum_sq / v_size - mean * mean;
float sum_diff_sq = 0.0f;
+    for (int i = 0; i < v_size; ++i) {
+      const float diff = input_vector[i] - mean;
+      sum_diff_sq += diff * diff;
+    }
+    const float variance = sum_diff_sq / v_size;
test accuracy for
// small mean, small variance
// small mean, large variance
// large mean, zero variance
// large mean, small variance
// large mean, large variance
129 Tensorflow/Keras f42d9846f6942e497645af28b3506e6163bdc8bf C++ fix underflow A mel spectrogram is a spectrogram where the frequencies are converted to the mel scale. It is used in signal processing and involves mapping an audio signal from the time domain to the frequency domain using the fast Fourier transform loss functions loss functions logistic loss, uniform distribution sampling, Mel-Frequency Cepstral Coefficient (MFCC) calculation use a different algorithm use a different algorithm Replace log(1 + x) with the numerically more stable log1p(x) LogisticLossUpdater : public DualLossUpdater {
- return log(1 + exp(-y_wx)) * example_weight;

double MfccMelFilterbank::FreqToMel(double freq) const {
-  return 1127.0 * log(1.0 + (freq / 700.0));

LogUniformSampler::LogUniformSampler(int64 range)
-    : RangeSampler(range), log_range_(log(range + 1)) {}

static float FreqToMel(float freq) {
-  return 1127.0 * log(1.0 + (freq / 700.0));

double MfccMelFilterbank::FreqToMel(double freq) const {
-  return 1127.0 * log(1.0 + (freq / 700.0));
class LogisticLossUpdater : public DualLossUpdater {
+      return log1p(exp(-y_wx)) * example_weight;

double MfccMelFilterbank::FreqToMel(double freq) const {
+  return 1127.0 * log1p(freq / 700.0);

LogUniformSampler::LogUniformSampler(int64 range)
+    : RangeSampler(range), log_range_(log1p(range)) {}

static float FreqToMel(float freq) {
+  return 1127.0 * log1p(freq / 700.0);

double MfccMelFilterbank::FreqToMel(double freq) const {
+  return 1127.0 * log1p(freq / 700.0);

130 Tensorflow/Keras 0fe671dd0a14614edbbd50397777def3bff770cc Cuda fix loss of precision Eigen MeanReducer is numerically unstable due to unstable summation operation. Summing numbers of different magnitude leads to loss of precision, numbers should be sorted tensor math statistics mean use a different algorithm use a different algorithm Don't use the numerically unstable MeanReducer class in Eigen.
131 Tensorflow/Keras f84e8257aa88fa45cc7a15835ad386565cd60237 C++ fix loss of precision Eigen MeanReducer is numerically unstable due to unstable summation operation. Summing numbers of different magnitude leads to loss of precision, numbers should be sorted CNN operations pooling layer mean, average pooling use a different algorithm use a different algorithm Change the Eigen reduction code to use a tree to improve numerical stability.
    This changes the InnerMostDimReducer to use a summation tree, which is more numerically stable than the previous approach of sequential addition into an accumulator.
    This solves the issue for reduction over all or a trailing subset of dimensions.
    This change does not improve the numerical accuracy for MeanReducer, which maintains state.
    
    Benchmarks show a 40% (AVX) to 50% (SSE) slowdown for small row reductions (sum, float). column- and full reductions are unchanged.
132 Tensorflow/Keras fa2132ab65f92ea40c94152dba105a9f86a0a555 Python fix loss of precision unsorted sum is numerically unstable gradients/derivatives gradients gradients, hessians, boosted trees increase variable precision/change variable type increase variable precision Use 64bit aggregation for gradients and hessians since the 32 bit version is numerically unstable for large minibatches.      per_partition_hessians = math_ops.unsorted_segment_sum(
-        hessians, mapped_partitions, array_ops.size(unique_partitions))
# Since unsorted_segment_sum can be numerically unstable, use 64bit
+    # operation.
+    gradients64 = math_ops.cast(gradients, dtypes.float64)
+    hessians64 = math_ops.cast(hessians, dtypes.float64)
     per_partition_gradients = math_ops.unsorted_segment_sum(
+        gradients64, mapped_partitions, array_ops.size(unique_partitions))
     per_partition_hessians = math_ops.unsorted_segment_sum(
+        hessians64, mapped_partitions, array_ops.size(unique_partitions))
+    per_partition_gradients = math_ops.cast(per_partition_gradients,
+                                            dtypes.float32)
+    per_partition_hessians = math_ops.cast(per_partition_hessians,
+                                           dtypes.float32)
sum
133 Tensorflow/Keras 48adc7ba73177f2a9331918b160bc3d0775985b8 Python fix underflow square root linear algebra norm L2 norm rewrite math formula rewrite math formula Avoid the potentially numerically unstable square root in linalg_ops.norm() mean = math_ops.square(linalg_ops.norm(m - m_w))  # This uses the L2 norm. mean = math_ops.reduce_sum(
+      math_ops.squared_difference(m, m_w))  # Equivalent to L2 but more stable.
square root, mean
134 Tensorflow/Keras 18f860fd8e1fdffd80633cf5ac32f895423dfa8d C++ fix underflow In computing, a normal number is a non-zero number in a floating-point representation which is within the balanced range supported by a given floating-point format: it is a floating point number that can be represented without leading zeros in its significand. other random number generator random number generator limit input range limit input range Create uniform numbers between 1 and 1.125  instead of -0.9 and 1.0 to avoid creating denormal numbers. std::uniform_real_distribution<FloatT> generator(-0.9f, 1.0f);
[&](tensorflow::gtl::ArraySlice<int64> /*indices*/) {
-        return generator(engine);
std::uniform_real_distribution<FloatT> generator(1.0f, 1.125f);
[&](tensorflow::gtl::ArraySlice<int64> indices) {
+        // Generate a random uniforma number from -0.0625 and 0.0625 and bias it
+        // with  a position dependent nubmer with mean 0.037109375. These number
+        // should allow for long chains of accumulation without being too close
+        // to zero or to large to accumulate all numbers accurately.
+        return (generator(engine) - 1.0625) +
+               static_cast<FloatT>(Product(indices) % 113 - 47) /
+                   static_cast<FloatT>(256.0f);
135 Tensorflow/Keras 6db014b44863bab616f026beab461fd646fcb505 C++ fix overflow/underflow gradients/derivatives gradients gradients testing other revert commit N/A TEST_F(NaryGradTest, Select) {
+  TensorShape shape({3, 2});
+  auto x1 = Placeholder(scope_, DT_FLOAT, Placeholder::Shape(shape));
+  auto x2 = Placeholder(scope_, DT_FLOAT, Placeholder::Shape(shape));
+  // Use constant values to avoid instability when computing
+  Tensor c =
+      test::AsTensor<float>({-3.5f, 1.5f, -1.2f, 3.0f, -2.5f, 2.8f}, {3, 2});
+  auto zero = Cast(scope_, Const(scope_, 0.0), c.dtype());
+  auto y = Where3(scope_, Greater(scope_, c, zero), x1, x2);
+  RunTest({x1, x2}, {shape, shape}, {y}, {shape});
+}

Status SelectGrad(const Scope& scope, const Operation& op,
+                  const std::vector<Output>& grad_inputs,
+                  std::vector<Output>* grad_outputs) {
+  auto comparator = op.input(0);
+  auto x = op.input(1);
+  auto zeros = ZerosLike(scope, x);
+  auto grad = grad_inputs[0];
+
+  auto gx_1 = Where3(scope, comparator, grad, zeros);
+  auto gx_2 = Where3(scope, comparator, zeros, grad);
+
+  grad_outputs->push_back(NoGradient());
+  grad_outputs->push_back(gx_1);
+  grad_outputs->push_back(gx_2);
+  return scope.status();
+}
+REGISTER_GRADIENT_OP("Select", SelectGrad);
136 Tensorflow/Keras 1bbec9e4e9c5d3fbbc2fa2b58841435e86dbf76a Cuda fix overflow linear algebra linear algebra log determinant use a different algorithm use a different algorithm Compute Determinant from a partially pivoted LU factorization
Change behavior for Determinant on matrices with (numerically) infinite determinants to match the behavior of numpy.linalg.det: Return inf for matrix with infinite determinant.
137 Tensorflow/Keras 265483857be3ca84b992937490ea8f0591b2d4ab Python fix overflow/underflow statistical distributions statistical distributions laplace distribution use a different algorithm use a different algorithm Add more stable calculation of Log of the cumulative distribution function and log survival function
138 Tensorflow/Keras e37e792d3eb2dac7ac627b7d8d56d69360649d19 Python fix loss of precision The raw formulation of cross-entropy, tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(tf.softmax(y)),
reduction_indices=[1])) can be numerically unstable.
loss functions loss functions cross entropy loss rewrite math formula rewrite math formula we apply
+`tf.nn.softmax_cross_entropy_with_logits` on the unnormalized logits (e.g., we
+call `softmax_cross_entropy_with_logits` on `tf.matmul(x, W) + b`), because this
+more numerically stable function internally computes the softmax activation.
cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1])) cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_))
139 Tensorflow/Keras a89c54d57209f91161fa450605f645c9124d89ac Python fix loss of precision statistical distributions statistical distributions Bernoulli distribution use a different algorithm use a different algorithm use logits to create the Bernoulli distribution
140 Tensorflow/Keras 14066c4b84e56c3b86f6152de1bb80df22341aa8 Python fix overflow/underflow statistical distributions statistical distributions log determinant, multivariate normal distribution use a different algorithm use a different algorithm Compute log_determinant instead of determinant in mvn to make stable (w.r.t. under/over flow). -def _determinant_from_sigma_chol(sigma_chol):
   det_last_dim = array_ops.rank(sigma_chol) - 2
   sigma_batch_diag = array_ops.batch_matrix_diag_part(sigma_chol)
-  det = math_ops.square(math_ops.reduce_prod(
-      sigma_batch_diag, reduction_indices=det_last_dim))
-  det.set_shape(sigma_chol.get_shape()[:-2])
-  return det
def _log_determinant_from_sigma_chol(sigma_chol):
   det_last_dim = array_ops.rank(sigma_chol) - 2
   sigma_batch_diag = array_ops.batch_matrix_diag_part(sigma_chol)
+  log_det = 2.0 * math_ops.reduce_sum(
+      math_ops.log(sigma_batch_diag), reduction_indices=det_last_dim)
+  log_det.set_shape(sigma_chol.get_shape()[:-2])
+  return log_det
141 Tensorflow/Keras bce6216610d57f8f4b1e9e79836737df109c4e42 Python fix loss of precision tensor math statistics variance with shifted data use a different algorithm use a different algorithm
142 Tensorflow/Keras 66f452d2217b155b697fc6d6cef5f56599ee2bbc C++ fix overflow overflow Only enable the HoistCommonFactorOutOfAggregation rewrite in aggressive mode, since it changes program behavior w.r.t. over- and underflow. For example, it will rewrite "0.5*x + 0.5*y" to "0.5*(x + y)", which will overflow if x + y > FLT_MAX, while the original expression does not overflow unless x + y > 2*FLT_MAX. optimizers optimizers arithmetic optimizer other allow code rewriting only in agressive mode Only enable the HoistCommonFactorOutOfAggregation rewrite in aggressive mode
143 Tensorflow/Keras 12243e6b65958c2e0c344aa3df4875f472ce5ae0 C++ fix overflow overflow other computational graph analytical cost estimator add overflow check add overflow check Fix integer-overflow in `tensorflow::grappler::AnalyticalCostEstimator::PredictCosts` by using MultiplyWithoutOverflow. MultiplyWithoutOverflow multiplies unsigned ints, since signed overflow is undefined, and checks for integer overflow. Return a null result on overflow size *= std::max<int64>(1, dim.size());       size = MultiplyWithoutOverflow(size, std::max<int64>(1, dim.size()));
      if (size < 0) {
        return errors::InvalidArgument(
            "Integer overflow encountered in dimension size.");
      }
144 Tensorflow/Keras cc464f04caa327d3f62d2f793a428cb7b0f0a5d7 Python unit test overflow overflow linear algebra linear algebra array product limit input range limit input range Limit input values to avoid integer overflow in reduction_ops_test. # overflow, divide the incremental int32 array by 2.
-    for rank in range(1, _MAX_RANK + 1):
-      np_arr = self._makeIncremental((2,) * rank, dtypes.int32) / 2

def testInt64(self):
-    for rank in range(1, _MAX_RANK + 1):
-      np_arr = self._makeIncremental((2,) * rank, dtypes.int64)
  # overflow, limit array values.
+    for rank in range(1, _MAX_RANK):
+      np_arr = self._makeIncremental((2,) * rank, dtypes.int32) % 5 + 1

def testInt64(self):
+    for rank in range(1, _MAX_RANK):
+      # Avoid overflow by limiting array values.
+      np_arr = self._makeIncremental((2,) * rank, dtypes.int64) % 11 + 1
145 Tensorflow/Keras 9d40a1573849b7e21d4f2d359fd9e87c40e33c0e Python Disable test overflow overflow tensor math tensor math division, mod testing disable test/warning disable test for division and mod Temporarily disable div overflow edge case due to ASAN failure.
146 Tensorflow/Keras b47be308c4b5ac7babd6400a8fb40c3d8bf163d6 C++ fix overflow overflow The original implementations of `google_floor_div`, XLA `FloorDiv` and
    MLIR `TF_FloorDivOp` all suffered from overflows for
`abs(x) + abs(y) > INT_MAX`
tensor math tensor math floor division rewrite math formula rewrite math formula Rewrite formula to
T z = x / y
return (z * y != x && (x < 0) != (y < 0)) ? z - 1 : z
def intEdgeTestData(self, dtype):
+    """Edge-case test data for integer types."""
+    nums = np.array([np.iinfo(dtype).min, -1, 1,
+                     np.iinfo(dtype).max],
+                    dtype=dtype).reshape([4, 1])
+    divs = nums.reshape([1, 4])
+    return nums, divs
+
+  def testFloorDivModIntEdges(self):
+    for dtype in [np.int32, np.int64]:
+      x, y = self.intEdgeTestData(dtype)
+      tf_floor_div = math_ops.floor_div(x, y)
+      np_floor_div = self.numpySafeFloorDivInt(x, y)
+      self.assertAllEqual(tf_floor_div, np_floor_div)
+      tf_floor_mod = math_ops.floormod(x, y)
+      np_floor_mod = self.numpySafeFloorModInt(x, y)
+      self.assertAllEqual(tf_floor_mod, np_floor_mod)
+      z = math_ops.add(math_ops.multiply(tf_floor_div, y), tf_floor_mod)
+      # x = floor_div(x, y) * y + floor_mod(x, y)
+      self.assertAllEqual(z, np.broadcast_to(x, z.shape))
+
+  def testTruncateDivModIntEdges(self):
+    for dtype in [np.int32, np.int64]:
+      x, y = self.intEdgeTestData(dtype)
+      tf_truncate_div = math_ops.truncatediv(x, y)
+      np_truncate_div = self.numpySafeTruncateDivInt(x, y)
+      self.assertAllEqual(tf_truncate_div, np_truncate_div)
+      tf_truncate_mod = math_ops.truncatemod(x, y)
+      np_truncate_mod = self.numpySafeTruncateModInt(x, y)
+      self.assertAllEqual(tf_truncate_mod, np_truncate_mod)
+      z = math_ops.add(math_ops.multiply(tf_truncate_div, y), tf_truncate_mod)
+      # x = truncatediv(x, y) * y + truncatemod(x, y)
+      self.assertAllEqual(z, np.broadcast_to(x, z.shape))
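A minimal sketch of the rewritten floor division from entry 146 above (plain C++, an illustration rather than the TensorFlow/XLA kernel): truncating division never overflows for y != 0 except for the INT_MIN / -1 corner case, and the quotient is decremented only when there is a nonzero remainder and the operands have opposite signs.
#include <cstdint>

// Assumes y != 0 and not (x == INT64_MIN && y == -1).
int64_t FloorDivSketch(int64_t x, int64_t y) {
  const int64_t z = x / y;  // truncates toward zero
  // |z * y| <= |x|, so the correction term cannot overflow.
  return (z * y != x && (x < 0) != (y < 0)) ? z - 1 : z;
}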
147 Tensorflow/Keras 4c0ee937c0f61c4fc5f5d32d9bb4c67428012a60 C++ fix overflow overflow other sparse operations sparse operations use a different algorithm use a different algorithm Prevent overflow by constructing the dense shape separately sparse::SparseTensor sparse_tensor;
     OP_REQUIRES_OK(context,
-                   sparse::SparseTensor::Create(
-                       input_indices, input_values,
-                       TensorShape(input_shape.vec<int64>()), &sparse_tensor));
TensorShape dense_shape;
+    const auto input_shape_flat = input_shape.flat<int64>();
+    for (int i = 0; i < input_shape.NumElements(); i++) {
+      OP_REQUIRES_OK(context,
+                     dense_shape.AddDimWithStatus(input_shape_flat(i)));
+    }


     sparse::SparseTensor sparse_tensor;
     OP_REQUIRES_OK(context,
+                   sparse::SparseTensor::Create(input_indices, input_values,
+                                                dense_shape, &sparse_tensor));
148 Tensorflow/Keras 7c8cc4ec69cd348e44ad6a2699057ca88faad3e5 C++ fix overflow overflow Op that looks up items from a sparse tensor in an embedding matrix. The sparse lookup tensor is represented by three individual tensors: lookup, indices, and dense_shape. integer overflow other sparse operations sparse operations, embedding add overflow check add overflow check ensure that output is not a null pointer that indicates overflow N/A TF_LITE_ENSURE(context, output_shape != nullptr);
149 Tensorflow/Keras 37054f9134af917ded7f40c7d663fa490d85c7d4 C++ fix overflow overflow activation functions activation functions range of activation function, quantization add overflow check add overflow check Add extra robustness by adding more overflow checks to CalculateActivationRangeQuantized for cases where output tensor has bad, but still valid quantization parameters, which cause integer overflow.
void CalculateActivationRangeQuantizedImpl(TfLiteFusedActivation activation,
-                                           int32_t qmin, int32_t qmax,
-                                           TfLiteTensor* output,
-                                           int32_t* act_min, int32_t* act_max) {

   if (activation == kTfLiteActRelu) {
-    *act_min = std::max(qmin, quantize(0.0));
   } else if (activation == kTfLiteActRelu6) {
-    *act_min = std::max(qmin, quantize(0.0));
-    *act_max = std::min(qmax, quantize(6.0));
   } else if (activation == kTfLiteActReluN1To1) {
-    *act_min = std::max(qmin, quantize(-1.0));
-    *act_max = std::min(qmax, quantize(1.0));

inline TfLiteStatus Quantize(TfLiteContext* context, float scale,
+                             int32_t zero_point, float f, int32_t& q) {
+  const float tmp = TfLiteRound(f / scale);
+  const bool no_integer_overflow_from_quantization =
+      (tmp >= std::numeric_limits<int32_t>::min() &&
+       tmp <= std::numeric_limits<int32_t>::max());
+  TF_LITE_ENSURE(context, no_integer_overflow_from_quantization);
+  q = zero_point + static_cast<int32_t>(tmp);
+  return kTfLiteOk;
+}
+
+TfLiteStatus CalculateActivationRangeQuantizedImpl(
+    TfLiteContext* context, TfLiteFusedActivation activation, int32_t qmin,
+    int32_t qmax, TfLiteTensor* output, int32_t* act_min, int32_t* act_max) {
+  int32_t tmp_q;
   if (activation == kTfLiteActRelu) {
+    TF_LITE_ENSURE_OK(context,
+                      Quantize(context, scale, zero_point, 0.0, tmp_q));
+    *act_min = std::max(qmin, tmp_q);
     *act_max = qmax;
   } else if (activation == kTfLiteActRelu6) {
+    TF_LITE_ENSURE_OK(context,
+                      Quantize(context, scale, zero_point, 0.0, tmp_q));
+    *act_min = std::max(qmin, tmp_q);
+    TF_LITE_ENSURE_OK(context,
+                      Quantize(context, scale, zero_point, 6.0, tmp_q));
+    *act_max = std::min(qmax, tmp_q);
   } else if (activation == kTfLiteActReluN1To1) {
+    TF_LITE_ENSURE_OK(context,
+                      Quantize(context, scale, zero_point, -1.0, tmp_q));
+    *act_min = std::max(qmin, tmp_q);
+    TF_LITE_ENSURE_OK(context,
+                      Quantize(context, scale, zero_point, 1.0, tmp_q));
+    *act_max = std::min(qmax, tmp_q);
   } else {
     *act_min = qmin;
     *act_max = qmax;
   }
+  return kTfLiteOk;
TEST_F(KernelUtilTest, ActivationRangeQuantizedOverflow) {
+  // Create output.
+  TfLiteTensor output = {};
+  output.type = kTfLiteUInt8;
+  output.allocation_type = kTfLiteArenaRw;
+  output.dims = nullptr;
+  TfLiteQuantizationParams output_quant = {1e-10, -128};
+  output.params = output_quant;
+  output.quantization.type = kTfLiteAffineQuantization;
+  auto* output_params = reinterpret_cast<TfLiteAffineQuantization*>(
+      malloc(sizeof(TfLiteAffineQuantization)));
+  output_params->scale = TfLiteFloatArrayCreate(1);
+  output_params->scale->data[0] = 1;
+  output_params->zero_point = TfLiteIntArrayCreate(1);
+  output_params->zero_point->data[0] = -128;
+  output.quantization.params = reinterpret_cast<void*>(output_params);
+
+  // For bounded activation, a too small scale value may cause overflow.
+  // Make sure overflow error is handled gracefully.
+  int32_t act_min, act_max;
+  ASSERT_EQ(kTfLiteOk,
+            CalculateActivationRangeQuantized(&context_, kTfLiteActRelu,
+                                              &output, &act_min, &act_max));
+  ASSERT_NE(kTfLiteOk,
+            CalculateActivationRangeQuantized(&context_, kTfLiteActRelu6,
+                                              &output, &act_min, &act_max));
+  EXPECT_TRUE(absl::StrContains(
+      context_.error, "no_integer_overflow_from_quantization was not true"));
+  ASSERT_NE(kTfLiteOk,
+            CalculateActivationRangeQuantized(&context_, kTfLiteActReluN1To1,
+                                              &output, &act_min, &act_max));
+  EXPECT_TRUE(absl::StrContains(
+      context_.error, "no_integer_overflow_from_quantization was not true"));
+
+  // Release.
+  TfLiteTensorFree(&output);
150 Tensorflow/Keras 4253f96a58486ffe84b61c0415bb234a4632ee73 C++ fix overflow overflow integer overflow other other concatenate limit input range limit input range TfLiteStatus Prepare(TfLiteContext* context, TfLiteNode* node) {
     TF_LITE_ENSURE_EQ(context, t->type, input_type);
     for (int d = 0; d < t0->dims->size; ++d) {
       if (d == axis) {
+        // Avoid integer overflow in sum_axis below
+        TF_LITE_ENSURE(context, t->dims->data[axis] >= 0);
+        TF_LITE_ENSURE(context, t->dims->data[axis] <=
+                                    std::numeric_limits<int>::max() - sum_axis);
sum
151 Tensorflow/Keras 704866eabe03a9aeda044ec91a8d0c83fc1ebdbe C++ fix overflow overflow other other join segments limit input range limit input range OP_REQUIRES(context, num_segments_tensor.NumElements() != 0,
+                errors::InvalidArgument("Number of segments cannot be empty."));
152 Tensorflow/Keras 87cf4d3ea9949051e50ca3f071fc909538a51cd0 C++ fix overflow overflow other sparse operations sparse operations (concat) add overflow check add overflow check break if overflow occurred bool overflow_ocurred = false;
     for (int i = 0; i < N; i++) {
+      int64 new_num_elements = 1;
       OP_REQUIRES(context, TensorShapeUtils::IsVector(shapes[i].shape()),
                   errors::InvalidArgument(
                       "Input shapes should be a vector but received shape ",
                       shapes[i].shape().DebugString(), " at position ", i));
+      auto input_shape_vector = shapes[i].vec<int64>();
+      for (int j = 0; j < input_shape_vector.size(); j++) {
+        new_num_elements =
+            MultiplyWithoutOverflow(new_num_elements, input_shape_vector(j));
+        if (new_num_elements < 0) {
+          overflow_ocurred = true;
+          break;
+        }
+      }
+
+      if (overflow_ocurred) {
+        break;
+      }
     }

+    OP_REQUIRES(
+        context, !overflow_ocurred,
+        errors::Internal("Encountered overflow from large input shape."));
153 Tensorflow/Keras b432a38fe0e1b4b904a6c222cbce794c39703e87 C++ unit test overflow overflow data processing image processing bounding boxes for image processing limit input range limit input range replace CHECK macros with OP_REQUIRES argument validation in the draw-bounding-box op and add useful error messages when an argument is invalid CHECK_GE(min_box_row_clamp, 0);
-        CHECK_GE(max_box_row_clamp, 0);
-        CHECK_LT(min_box_row_clamp, height);
-        CHECK_LT(max_box_row_clamp, height);
-        CHECK_GE(min_box_col_clamp, 0);
-        CHECK_GE(max_box_col_clamp, 0);
-        CHECK_LT(min_box_col_clamp, width);
-        CHECK_LT(max_box_col_clamp, width);
-        CHECK_LT(min_box_row, height);
-        CHECK_GE(max_box_row, 0);
-        CHECK_LT(min_box_col, width);
-        CHECK_GE(max_box_col, 0);
OP_REQUIRES(
+            context, min_box_row_clamp >= 0,
+            errors::InvalidArgument("Min box row clamp is less than 0."));
+        OP_REQUIRES(
+            context, max_box_row_clamp >= 0,
+            errors::InvalidArgument("Max box row clamp is less than 0."));
+        OP_REQUIRES(context, min_box_row_clamp <= height,
+                    errors::InvalidArgument(
+                        "Min box row clamp is greater than height."));
+        OP_REQUIRES(context, max_box_row_clamp <= height,
+                    errors::InvalidArgument(
+                        "Max box row clamp is greater than height."));
+
+        OP_REQUIRES(
+            context, min_box_col_clamp >= 0,
+            errors::InvalidArgument("Min box col clamp is less than 0."));
+        OP_REQUIRES(
+            context, max_box_col_clamp >= 0,
+            errors::InvalidArgument("Max box col clamp is less than 0."));
+        OP_REQUIRES(context, min_box_col_clamp <= width,
+                    errors::InvalidArgument(
+                        "Min box col clamp is greater than width."));
+        OP_REQUIRES(context, max_box_col_clamp <= width,
+                    errors::InvalidArgument(
+                        "Max box col clamp is greater than width."));

+        OP_REQUIRES(
+            context, min_box_row <= height,
+            errors::InvalidArgument("Min box row is greater than height."));
+        OP_REQUIRES(context, max_box_row >= 0,
+                    errors::InvalidArgument("Max box row is less than 0."));
+        OP_REQUIRES(
+            context, min_box_col <= width,
+            errors::InvalidArgument("Min box col is greater than width."));
+        OP_REQUIRES(context, max_box_col >= 0,
+                    errors::InvalidArgument("Max box col is less than 0."));
154 Tensorflow/Keras 87d2b9751513253058be671313db3e32cc13842a C++ unit test overflow overflow other sparse operations sparse operations (concat) fix test/warning fix overflow check                                          shapes.size()));
-    bool overflow_ocurred = false;
     for (int i = 0; i < N; i++) {
-      int new_num_elements = 1;
       OP_REQUIRES(context, TensorShapeUtils::IsVector(shapes[i].shape()),
                   errors::InvalidArgument(
                       "Input shapes should be a vector but received shape ",
                       shapes[i].shape().DebugString(), " at position ", i));
-      auto input_shape_vector = shapes[i].vec<int64>();
-      for (int j = 0; j < input_shape_vector.size(); j++) {
-        new_num_elements =
-            MultiplyWithoutOverflow(new_num_elements, input_shape_vector(j));
-        if (new_num_elements < 0) {
-          overflow_ocurred = true;
-          break;
-        }
-      }
-
-      if (overflow_ocurred) {
-        break;
-      }
     }

-    OP_REQUIRES(
-        context, !overflow_ocurred,
-        errors::Internal("Encountered overflow from large input shape."));
N/A
155 Tensorflow/Keras 7bb2d255e6d404cbfa528d0ffc2f22248e6c1b21 Cuda fix overflow overflow other sparse operations sparse to dense operation increase variable precision/change variable type change variable type use int64 as index Index output_idx = indices[thread_idx * ndims + ndims - 1]; int64 output_idx = indices[thread_idx * ndims + ndims - 1];
156 Tensorflow/Keras dc4d330cfe25bbb0c3e4759dadfb16d4715f338a C++ fix underflow underflow If a complex value's squared norm was denormal but had a non-zero imaginary part, the Householder reflection computation could yield NaNs. linear algebra norm norm use a different algorithm use a different algorithm By using a more accurate norm, we can avoid the underflow. The new Norm helper computes sqrt(x^2 + y^2 + ...), avoiding overflow/underflow. auto mu = Sqrt(Real(alpha * Conj(alpha)) + sigma); XlaOp Norm(std::vector<XlaOp> xs) {
+  CHECK(!xs.empty());
+  XlaOp w;
+  for (size_t i = 0; i < xs.size(); ++i) {
+    xs[i] = Abs(xs[i]);
+    w = i == 0 ? xs[i] : xla::Max(w, xs[i]);
+  }
+
+  XlaOp out;
+  for (size_t i = 0; i < xs.size(); ++i) {
+    XlaOp t = Square(xs[i] / w);
+    out = i == 0 ? t : xla::Add(out, t);
+  }
+  return Select(Eq(w, ZerosLike(w)), ZerosLike(w), w * Sqrt(out));

auto mu = Norm({Real(alpha), Imag(alpha), Sqrt(sigma)});
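A minimal scalar sketch of the scaled norm used in entry 156 (plain C++ with doubles rather than XlaOps; names are illustrative): factoring out the largest magnitude keeps every squared term in [0, 1], so sqrt(x_0^2 + x_1^2 + ...) neither underflows nor overflows in the intermediate computation.
#include <algorithm>
#include <cmath>
#include <vector>

double StableNorm(std::vector<double> xs) {
  double w = 0.0;
  for (double& x : xs) {
    x = std::fabs(x);
    w = std::max(w, x);
  }
  if (w == 0.0) return 0.0;  // all inputs are zero
  double sum = 0.0;
  for (double x : xs) {
    const double t = x / w;  // in [0, 1], so t * t cannot overflow
    sum += t * t;
  }
  return w * std::sqrt(sum);
}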
157 Tensorflow/Keras ff6601a943db5f71fda09210e67ba8e9fd839ae8 C++ fix overflow overflow CombinedNonMaxSuppression greedily selects a subset of bounding boxes in descending order of score data processing image processing non_max_suppression increase variable precision/change variable type change variable type Use an int32 scalar as the default type for representing the maximum number of boxes retained over all classes. If int32 should overflow, use int64 max_total_size = ops.convert_to_tensor(max_total_size)
class CombinedNonMaxSuppressionTest(test_util.TensorFlowTestCase):
+
+  # NOTE(b/142795960): parameterized tests do not work well with tf.tensor
+  # inputs. Due to failures, creating another test `testInvalidTensorInput`
+  # which is identical to this one except that the input here is a scalar as
+  # opposed to a tensor.
+  def testInvalidPyInput(self):
+    boxes_np = [[[[0, 0, 1, 1], [0, 0.1, 1, 1.1], [0, -0.1, 1, 0.9],
+                  [0, 10, 1, 11], [0, 10.1, 1, 11.1], [0, 100, 1, 101]]]]
+    scores_np = [[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]
+    max_output_size_per_class = 5
+    max_total_size = 2**31
+    with self.assertRaisesRegex(
+        (TypeError, ValueError),
+        "type int64 that does not match expected type of int32|"
+        "Tensor conversion requested dtype int32 for Tensor with dtype int64"):
+      image_ops.combined_non_max_suppression(
+          boxes=boxes_np,
+          scores=scores_np,
+          max_output_size_per_class=max_output_size_per_class,
+          max_total_size=max_total_size)
+
+  # NOTE(b/142795960): parameterized tests do not work well with tf.tensor
+  # inputs. Due to failures, creating another this test which is identical to
+  # `testInvalidPyInput` except that the input is a tensor here as opposed
+  # to a scalar.
+  def testInvalidTensorInput(self):
+    boxes_np = [[[[0, 0, 1, 1], [0, 0.1, 1, 1.1], [0, -0.1, 1, 0.9],
+                  [0, 10, 1, 11], [0, 10.1, 1, 11.1], [0, 100, 1, 101]]]]
+    scores_np = [[[0.9, 0.75, 0.6, 0.95, 0.5, 0.3]]]
+    max_output_size_per_class = 5
+    max_total_size = ops.convert_to_tensor(2**31)
+    with self.assertRaisesRegex(
+        (TypeError, ValueError),
+        "type int64 that does not match expected type of int32|"
+        "Tensor conversion requested dtype int32 for Tensor with dtype int64"):
+      image_ops.combined_non_max_suppression(
+          boxes=boxes_np,
+          scores=scores_np,
+          max_output_size_per_class=max_output_size_per_class,
+          max_total_size=max_total_size)
158 Tensorflow/Keras 94b6db8cc538408cc29d88be13307f9fd8a77120 C++ fix overflow overflow Dynamic stitch interleaves the values from the data tensors into a single tensor. slice_size must not be stored as int for cases of tensors over 2GB. data processing data dynamic_stitch increase variable precision/change variable type change variable type use auto type instead of int const int slice_size = merged_flat.dimension(1); const auto slice_size = merged_flat.dimension(1);
159 Tensorflow/Keras 087859fce9409991164f727735743da4cb310fd4 C++ fix overflow overflow large input size other computational graph bilinear operation, computational graph optimization increase variable precision/change variable type change variable type use int64 instead of int const int output_elements = CalculateTensorElementCount( const int64 output_elements = CalculateTensorElementCount // Cost with very large tensor.
+    op_context.op_info.clear_outputs();
+    // Number of elements in tensor exceeds 2^32.
+    constexpr int64 kLargeOutputImageDim = 40000;
+    DescribeTensor4D(1, kLargeOutputImageDim, kLargeOutputImageDim,
+                     kChannelSize, op_context.op_info.add_outputs());
+    const int64 kInterpWeightCost = 12;
+    // Using half_pixel_centers.
+    AttrValue half_pixel_centers;
+    half_pixel_centers.set_b(true);
+    (*op_context.op_info.mutable_attr())["half_pixel_centers"] =
+        half_pixel_centers;
+
+    const int64 num_ops =
+        kInterpWeightCost * (kLargeOutputImageDim * 2) +
+        kComputeLerpCost *
+            (kLargeOutputImageDim * kLargeOutputImageDim * kChannelSize);
+    const int64 expected_compute_time = std::ceil(
+        num_ops /
+        estimator_.GetDeviceInfo(op_context.op_info.device()).gigaops);
+
+    const int64 expected_memory_time =
+        (kImageDim * kImageDim + kLargeOutputImageDim * kLargeOutputImageDim) *
+        4;
+
+    const auto cost = PredictCosts(op_context);
+    EXPECT_EQ(cost.compute_time, Costs::Duration(expected_compute_time));
+    EXPECT_EQ(cost.memory_time, Costs::Duration(expected_memory_time));
+    EXPECT_EQ(cost.execution_time,
+              Costs::Duration(expected_memory_time + expected_compute_time));
+    EXPECT_FALSE(cost.inaccurate);
+    EXPECT_EQ(cost.num_ops_with_unknown_shapes, 0);
+  }
160 Tensorflow/Keras 90e89339a9bf04fb304129a01ca50f25fdde441d C++ fix overflow overflow potential overflow in 64-bit MultiplyByQuantizedMultiplier function
quantization quantization quantization use a different algorithm use a different algorithm int32_t reduced_multiplier = (quantized_multiplier + (1 << 15)) >> 16;
int32_t reduced_multiplier = (quantized_multiplier < 0x7FFF0000)
+                                   ? ((quantized_multiplier + (1 << 15)) >> 16)
+                                   : 0x7FFF;
161 Tensorflow/Keras dffb0b56192f4c95fbf563a82742b4a3f4881e05 C++ fix overflow overflow     A U16 of 46977 multiplied by a U16 of 53826, when evaluated in the evaluator,
    results in the operands of the multiply getting promoted to the C++ type "int"
    which is signed. The result of the multiply will overflow a signed int and give
    a negative result.
compiler compiler compiler, XLA HLO (high level operations) increase variable precision/change variable type change variable type promote both operands to "unsigned int" which will not suffer
    from any overflow issues
typename std::enable_if<std::is_integral<T>::value &&
-                                  std::is_signed<T>::value>::type* = nullptr>
-typename std::make_unsigned<T>::type ToArithmeticSafeType(T t) {
-  return static_cast<typename std::make_unsigned<T>::type>(t);

-          typename std::enable_if<!std::is_integral<T>::value ||
-                                  !std::is_signed<T>::value>::type* = nullptr>
namespace detail {
+template <typename T>
+using unsigned_promoted_type_t =
+    std::make_unsigned_t<decltype(std::declval<T>() + std::declval<T>())>;
+}

+          typename std::enable_if<std::is_integral<T>::value>::type* = nullptr>
+detail::unsigned_promoted_type_t<T> ToArithmeticSafeType(T t) {
+  return static_cast<detail::unsigned_promoted_type_t<T>>(t);

+          typename std::enable_if<!std::is_integral<T>::value>::type* = nullptr>
162 Tensorflow/Keras 90b80fba1ade0222713b8a33af00858190532075 C++ fix overflow overflow compiler message overflow compiler compiler compiler, XLA HLO (high level operations) limit input range limit input range limit max inputs SummarizeNodeDef(node_def), ".\n");
string SummarizeNodeDef(const NodeDef& node_def) {
SummarizeNodeDef(node_def, /*max_inputs_in_summary=*/10), ".\n");
+string SummarizeNodeDef(const NodeDef& node_def, int max_inputs_in_summary) {
+    if (max_inputs_in_summary-- == 0) {
+      strings::StrAppend(&ret, "...");
+      break;
+// The parameter `max_inputs_in_summary` specifies how many inputs at most to
+// serialize in the output (in order not to get a string which is overly large).
+// The value `-1` specifies that all inputs will be shown.
+string SummarizeNodeDef(const NodeDef& node_def,
+                        int max_inputs_in_summary = -1);
163 Tensorflow/Keras 036b75a818493a30cd25caef1761931a3bc2b074 C++ fix overflow overflow compiler compiler compiler increase variable precision/change variable type increase variable precision increase precision of index from int to int64 int linear_index = j * vector_size + i; int64 linear_index = j * vector_size + i;
164 Tensorflow/Keras 2adf1114d4dc7ca30e5117acd2dc7aeb3279feb7 C++ unit test overflow overflow The Android Neural Networks API (NNAPI) is available on all Android devices running Android 8.1 (API level 27) or higher. It provides acceleration for TensorFlow Lite models on Android devices with supported hardware accelerators including:

Graphics Processing Unit (GPU)
Digital Signal Processor (DSP)
Neural Processing Unit (NPU)
other other NNAPI delegate add overflow check add overflow check restrict the filter-size overflow check to the quantized reference CPU path     // reference CPU path.
-      Expect(is_accelerator_specified ||
-                 (builtin->filter_width * builtin->filter_height <= 256),
-             NNAPIValidationFailureType::kUnsupportedOperandSize,
-             "Large filter window would overflow on the reference CPU path",
-             &val_ctx);
     // quantized reference CPU path.
+      if (IsQuantized(context->tensors[node->inputs->data[0]].type)) {
+        Expect(is_accelerator_specified ||
+                   (builtin->filter_width * builtin->filter_height <= 256),
+               NNAPIValidationFailureType::kUnsupportedOperandSize,
+               "Large filter window would overflow on the reference CPU path",
+               &val_ctx);
+      }
165 Tensorflow/Keras 85f10eb4200b3b3339340943b288da157e9742e7 C++ unit test overflow overflow Compilers are producing different
    code and resulting in bad assumptions.
precision tests/speed benchmarks overflow test overflow test increase variable precision/change variable type change variable type change type of variable y from auto to int64 -  for (auto x : interesting) {
-    for (auto y : interesting) {
     
-  long double dxy = static_cast<long double>(x) * y;
-      if (dxy > std::numeric_limits<int64>::max()) {
-        EXPECT_LT(xy, 0);
bool HasOverflow(int64 x, int64 y) {
+#ifdef PLATFORM_WINDOWS
+  // `long double` on MSVC is 64 bits not 80 bits - use a windows specific API
+  // for this test.
+  return ::MultiplyHigh(x, y) != 0;
+#else
+  long double dxy = static_cast<long double>(x) * static_cast<long double>(y);
+  return dxy > std::numeric_limits<int64>::max();
+#endif
+}

+  for (int64 x : interesting) {
+    for (int64 y : interesting) {
    if (HasOverflow(x, y)) {
+        EXPECT_LT(xy, 0) << x << " " << y;
166 Tensorflow/Keras 171ba06f5e52078e0aa2112797b5a4227370bbd5 C++ unit test overflow overflow Subgraphs are part of the main graph and are themselves computational graphs by nature. other computational graph tensorflow subgraph graph generation add overflow check add overflow check bring back overflow detection for Windows
167 Tensorflow/Keras 2522ce7dd5d28c9733824a66133fc918290e3ed0 C++ fix overflow overflow data processing tensor allocation tensor allocation add overflow check add overflow check Check for overflow in # of bytes computation of tensor allocation.
Check both for product of shape dimensions (# of elements) and number of bytes (elements * sizeof(data_type)).
no overflow check TfLiteStatus MultiplyAndCheckOverflow(size_t a, size_t b, size_t* product) {
+  constexpr size_t overflow_threshold = (8 * sizeof(size_t)) >> 1;
+  *product = a * b;
+  // If neither integers have non-zero bits past 32 bits can't overflow.
+  // Otherwise check using slow division.
+  if (__builtin_expect((a | b) >> overflow_threshold != 0, false)) {
+    if (a != 0 && *product / a != b) return kTfLiteError;
+  }
+  return kTfLiteOk;

+  for (int k = 0; k < dims_size; k++) {
+    size_t old_count = count;
+    TF_LITE_ENSURE_MSG(
+        &context_,
+        MultiplyAndCheckOverflow(old_count, dims[k], &count) == kTfLiteOk,
+        "BytesRequired number of elements overflowed.\n");
+  }
   size_t type_size = 0;
   TF_LITE_ENSURE_OK(&context_, GetSizeOfType(&context_, type, &type_size));

+  TF_LITE_ENSURE_MSG(
+      &context_, MultiplyAndCheckOverflow(type_size, count, bytes) == kTfLiteOk,
+      "BytesRequired number of bytes overflowed.\n");

TEST(BasicInterpreter, TestOverflow) {
+  TestErrorReporter reporter;
+  Interpreter interpreter(&reporter);
+  TfLiteQuantizationParams quantized;
+
+  ASSERT_EQ(interpreter.AddTensors(1), kTfLiteOk);
+  ASSERT_EQ(interpreter.SetInputs({0}), kTfLiteOk);
+  ASSERT_EQ(interpreter.SetOutputs({0}), kTfLiteOk);
+  // Overflow testing is pointer word size dependent.
+  if (sizeof(size_t) == 8) {
+    // #bits for bytecount = 30 + 30 + 2 = 62 < 64
+    ASSERT_EQ(interpreter.SetTensorParametersReadWrite(
+                  0, kTfLiteFloat32, "in1", {1 << 30, 1 << 30}, quantized),
+              kTfLiteOk);
+    // #bits for element count = 30 + 30 + 2 = 62 < 64 (no overflow)
+    // #bits for byte count = 30 + 30 + 2 + 2 = 64 == 64 (overflow)
+    ASSERT_NE(
+        interpreter.SetTensorParametersReadWrite(
+            0, kTfLiteFloat32, "in1", {1 << 30, 1 << 30, 1 << 2}, quantized),
+        kTfLiteOk);
+    EXPECT_THAT(
+        reporter.error_messages(),
+        testing::EndsWith("BytesRequired number of bytes overflowed.\n"));
+    // #bits for element count = 30 + 30 + 2 + 4 = 66 > 64 (overflow).
+    // #bits for byte count = 30 + 30 + 2 + 4 + 2 = 68 > 64 (overflow).
+    reporter.Reset();
+    ASSERT_NE(interpreter.SetTensorParametersReadWrite(
+                  0, kTfLiteFloat32, "in1", {1 << 30, 1 << 30, 1 << 2, 1 << 4},
+                  quantized),
+              kTfLiteOk);
+    EXPECT_THAT(
+        reporter.error_messages(),
+        testing::EndsWith("BytesRequired number of elements overflowed.\n"));
+
+  } else if (sizeof(size_t) == 4) {
+    // #bits for bytecount = 14 + 14 + 2 = 30 < 32
+    ASSERT_EQ(interpreter.SetTensorParametersReadWrite(
+                  0, kTfLiteFloat32, "in1", {1 << 14, 1 << 14}, quantized),
+              kTfLiteOk);
+    // #bits for element count = 14 + 14 + 3 = 31 < 32 (no overflow).
+    // #bits for byte count = 14 + 14 + 3 + 2 = 33 > 32 (overflow).
+    ASSERT_NE(
+        interpreter.SetTensorParametersReadWrite(
+            0, kTfLiteFloat32, "in1", {1 << 14, 1 << 14, 1 << 3}, quantized),
+        kTfLiteOk);
+    EXPECT_THAT(
+        reporter.error_messages(),
+        testing::EndsWith("BytesRequired number of bytes overflowed.\n"));
+    // #bits for element count = 14 + 14 + 4 = 32 == 32 (overflow).
+    // byte count also overflows, but we don't get to that check.
+    reporter.Reset();
+    ASSERT_NE(
+        interpreter.SetTensorParametersReadWrite(
+            0, kTfLiteFloat32, "in1", {1 << 14, 1 << 14, 1 << 4}, quantized),
+        kTfLiteOk);
+    EXPECT_THAT(
+        reporter.error_messages(),
+        testing::EndsWith("BytesRequired number of elements overflowed.\n"));
+  } else {
+    // This test failing means that we are using a non 32/64 bit architecture.
+    ASSERT_TRUE(false);
+  }
+}
168 Tensorflow/Keras 75e5b5d70b6f33bd41fdf07b844c762b23f99d1b C++ fix overflow overflow overflows in accumulation results tensor math tensor math summation increase variable precision/change variable type increase variable precision upcast to an integer type with more bits N/A // Upcast small integer types to 32 bit to avoid overflow.
+  if (dtype == DT_INT8 || dtype == DT_INT16) {
+    return DT_INT32;
+  }
+  if (dtype == DT_UINT8 || dtype == DT_UINT16) {
+    return DT_UINT32;
+  }
N/A
169 Tensorflow/Keras 23fde233bf3210759b5a4453bc39101df9c86d0c C++ fix overflow overflow tensor math statistics mean increase variable precision/change variable type increase variable precision Perform mean reductions for integer types in 64 bit to mitigate overflow in the sum and/or denominator.
I.e.: Upcast int8, int16, int32 into int64
#define CASTING_SPECIALIZATION(ScalarType, IntermediateType)                  \
+  template <typename Device, typename OUT_T, typename IN_T,                   \
+            typename ReductionAxes>                                           \
+  struct ReduceEigenImpl<Device, OUT_T, IN_T, ReductionAxes,                  \
+                         functor::MeanReducer<ScalarType>> {                  \
+    void operator()(const Device& d, OUT_T out, IN_T in,                      \
+                    const ReductionAxes& reduction_axes,                      \
+                    const functor::MeanReducer<ScalarType>& reducer) {        \
+      static_assert(std::is_same<ScalarType, typename OUT_T::Scalar>::value,  \
+                    "");                                                      \
+      Eigen::internal::SumReducer<IntermediateType> sum_reducer;              \
+      out.device(d) = (in.template cast<IntermediateType>().reduce(           \
+                           reduction_axes, sum_reducer) /                     \
+                       static_cast<IntermediateType>(in.size() / out.size())) \
+                          .template cast<ScalarType>();                       \
+    }                                                                         \
+  }
# This tests the issue reported in b/145030710.
+  @test_util.run_deprecated_v1
+  def testSizeOverflowUint8(self):
+    np_arr = self._makeRandom((2**8,), dtypes.uint8)
+    self._compareAllAxes(np_arr)
+
+  @test_util.run_deprecated_v1
+  def testSizeOverflowInt8(self):
+    np_arr = self._makeRandom((2**7,), dtypes.int8)
+    self._compareAllAxes(np_arr)
+
+  @test_util.run_deprecated_v1
+  def testSizeOverflowUint16(self):
+    np_arr = self._makeRandom((2**16,), dtypes.uint16)
+    self._compareAllAxes(np_arr)
+
+  @test_util.run_deprecated_v1
+  def testSizeOverflowInt16(self):
+    np_arr = self._makeRandom((2**15,), dtypes.int16)
+    self._compareAllAxes(np_arr)
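A minimal sketch of the upcast-before-reduce idea from entry 169 (plain C++ rather than the Eigen specialization; names are illustrative): the sum of even a few hundred small integer values can overflow an 8- or 16-bit accumulator, so the reduction is carried out in a 64-bit intermediate type and only the final mean is cast back.
#include <cstdint>
#include <vector>

// Assumes values is non-empty.
uint8_t MeanWithUpcast(const std::vector<uint8_t>& values) {
  int64_t sum = 0;  // wide accumulator: the sum of uint8 elements cannot overflow it
  for (uint8_t v : values) sum += v;
  return static_cast<uint8_t>(sum / static_cast<int64_t>(values.size()));
}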
170 Tensorflow/Keras 79605069321520bd8af318eef92b71070dcc8961 C++ fix overflow overflow strided_slice extracts a strided slice of a tensor (generalized Python array indexing). strided_slice would overflow for end and start slices larger than int16 other other strided slice kernel increase variable precision/change variable type increase variable precision change the StridedSliceParams start_indices and end_indices from int16 to int32 values struct StridedSliceParams {
   int8 start_indices_count;
-  int16 start_indices[4];
   int8 stop_indices_count;
-  int16 stop_indices[4];
   int8 strides_count;
-  int16 strides[4];
struct StridedSliceParams {
   int8 start_indices_count;
+  int32 start_indices[4];
   int8 stop_indices_count;
+  int32 stop_indices[4];
   int8 strides_count;
+  int32 strides[4];
TEST(StridedSliceOpTest, In1D_Int32End) {
+  StridedSliceOpModel<> m({32768}, {1}, {1}, {1}, 0, 0, 0, 0, 0);
+  std::vector<float> values;
+  for (int i = 0; i < 32768; i++) {
+    values.push_back(i);
+  }
+  m.SetInput(values);
+  m.SetBegin({0});
+  m.SetEnd({32768});
+  m.SetStrides({1});
+  m.Invoke();
+  EXPECT_THAT(m.GetOutputShape(), ElementsAreArray({32768}));
+  EXPECT_THAT(m.GetOutput(), ElementsAreArray(values));
171 Tensorflow/Keras eaea3db3be4e27464a0b669bebffe46f2f8b005f C++ fix overflow overflow overflow in quantization if there is a mismatch in scale of weights and biases quantization quantization quantization limit input range limit input range Adjusts the scale of the weight tensor when the scale is small enough to lead to overflow due to a mismatch with the scale of the bias values.
Checks that the bias is quantized to within the middle half of the allowable bit range determined by the scales of the input and weight tensors. If this condition is not satisfied, the scale of the weights is increased in order to prevent overflow.
TfLiteStatus AdjustWeightsForBiasScale(QuantizationParametersT* quant_params,
+                                       const float* bias_data,
+                                       const size_t bias_size,
+                                       const float input_scale,
+                                       ErrorReporter* error_reporter) {
+  // TODO(dmolitor) Allow adjusting activation scale.
+  // TODO(dmolitor) Tighten scale adjustment.
+  // TODO(dmolitor) Test using a separate strategy for scales of 0.
+  const int32_t kScale = std::numeric_limits<int32_t>::max();
+  if (quant_params == nullptr) {
+    error_reporter->Report("Missing max and min values for weight tensor.");
+    return kTfLiteError;
+  }
+  // channel_dim_size is calculated from min.size() to infer whether
+  // quantization is per axis
+  int channel_dim_size = quant_params->min.size();
+  if (channel_dim_size == 0) {
+    error_reporter->Report(
+        "Missing weight scales. Unable to check compatibility with bias "
+        "scale.");
+    return kTfLiteError;
+  }
+
+  std::vector<float> weight_scales(channel_dim_size);
+  TF_LITE_ENSURE_STATUS(GetSymmetricScalesFromMaxMin(
+      quant_params, &weight_scales, error_reporter));
+
+  // Per channel quantization
+  if (channel_dim_size > 1) {
+    for (size_t i = 0; i < channel_dim_size; ++i) {
+      // Current scale is not compatible with bias. Adjust max/min values.
+      if (std::abs(bias_data[i]) >=
+          0.5 * input_scale * weight_scales[i] * kScale) {
+        quant_params->max[i] = 2.0 * std::abs(bias_data[i]) / kScale *
+                               (kMaxQuantizedValue / input_scale);
+        quant_params->min[i] = -quant_params->max[i];
+      }
+    }
+    // Per layer quantization
+  } else if (channel_dim_size == 1) {
+    const auto minmax = std::minmax_element(bias_data, bias_data + bias_size);
+    const float bias_half_range =
+        std::max(std::abs(*minmax.first), std::abs(*minmax.second));
+
+    // Need to adjust weight min/max; not compatible with bias.
+    if (bias_half_range / kScale >= 0.5 * input_scale * weight_scales[0]) {
+      quant_params->min[0] =
+          2.0 * bias_half_range / kScale * (kMinQuantizedValue / input_scale);
+      quant_params->max[0] =
+          2.0 * bias_half_range / kScale * (kMaxQuantizedValue / input_scale);
+    }
+  }
+  return kTfLiteOk;
172 Tensorflow/Keras 676bce388aba376a4e6f7307dc92fdc0a8b3af42 C++ fix overflow overflow Quantized mean and sum have a risk of overflow quantization quantization quantization limit input range limit input range cast input based on numeric limits // Convert to float value.
-        output_data[idx] =
-            static_cast<T>(std::round(float_mean * scale + bias)) +
-            output_zero_point;
float result =
+            std::min(std::round(float_mean * scale + bias) + output_zero_point,
+                     static_cast<float>(std::numeric_limits<T>::max()));
+        result =
+            std::max(result, static_cast<float>(std::numeric_limits<T>::min()));
+        output_data[idx] = static_cast<T>(result);
N/A
173 Tensorflow/Keras e08474a981b87a8c4fdc9d9d08765727fe8d629e C++ fix overflow overflow compiler compiler compiler, variable accessor increase variable precision/change variable type increase variable precision Change shared variables to high precision
174 Tensorflow/Keras c782a538b0b90d93c6070ac177cb1f542272bcce C++ fix overflow overflow overflowing of integer "+" and "-" operations CNN operations convolution convolution transpose rewrite math formula rewrite math formula -        int i = y * $kernel_size.x$ + x;
-        ivec2 idx = gid.xy + ivec2(x, y) - $padding$;
int i = int(float(y * $kernel_size.x$) + float(x));        
+        ivec2 idx = ivec2(vec2(gid.xy + ivec2(x, y)) - vec2($padding$));
175 Tensorflow/Keras ea316ec1827bacae811858a7f681dfac47ef7f47 C++ fix overflow overflow signed overflow compiler compiler compiler, dot interpreter increase variable precision/change variable type change variable type change type to unsigned -                static_cast<ElementwiseT>(lhs_literal.Get<ReturnT>(lhs_index)) *
-                static_cast<ElementwiseT>(rhs_literal.Get<ReturnT>(rhs_index));
ElementwiseT lhs_val(lhs_literal.Get<ReturnT>(lhs_index));
+            ElementwiseT rhs_val(rhs_literal.Get<ReturnT>(rhs_index));
ToArithmeticSafeType(lhs_val) * ToArithmeticSafeType(rhs_val);
176 Tensorflow/Keras 09b8ed34f47dbd6921304f2d4ceb3669c1e089e6 Python fix overflow overflow int32 overflow other other flatten layer increase variable precision/change variable type increase variable precision increase precision of variable shape to int64 if necessary, otherwise keep it as int32 input_shape = inputs.shape
+    if input_shape[1:].is_fully_defined():
+      flattened_dim = tensor_shape.dimension_value(
+          np.prod(input_shape[1:], dtype=int))
+      # Temporary fix for integer overflow issue.
+      if flattened_dim > np.iinfo(np.int32).max:
+        shape_dtype = dtypes.int64
+      else:
+        shape_dtype = dtypes.int32
+      outputs = array_ops.reshape(
+          inputs, constant_op.constant((-1, flattened_dim), shape_dtype))
def testFlattenLargeDim(self):
+    x = array_ops.placeholder(shape=(None, 21316, 21316, 80), dtype='float32')
+    y = core_layers.Flatten()(x)
+    self.assertEqual(y.shape.as_list(), [None, 21316 * 21316 * 80])
177 Tensorflow/Keras dbcb2a5470e40974924cebd0f74d7f117b21bf8e C++ fix overflow overflow compiler compiler compiler, bit cast operation increase variable precision/change variable type increase variable precision increase precision of an integer to int64 auto output_bit_width_mask = (1 << output_bit_width) - 1; auto output_bit_width_mask = (int64(1) << output_bit_width) - 1;
178 Tensorflow/Keras 0d6095963d907e0de1d635842d8ed80759a436ba C++ fix overflow overflow data processing memory allocator ruy allocator, size increase variable precision/change variable type change variable type change from std::size_t to std::ptrdiff_t
179 Tensorflow/Keras 5b4fe5470852d1aea737b194e03727cdedddebca C++ fix underflow underflow exponent smaller than -31 causes underflow quantization quantization quantization rewrite math formula rewrite math formula For exponents smaller than -31, set shift to zero void GuardedQuantizeMultiplier(double effective_output_scale,
-                               int32_t* significand, int* shift) {
-  QuantizeMultiplier(effective_output_scale, significand, shift);
-  // Additional guard to make sure RoundingDivideByPOT does not fail.
-  if (*shift < -31) {
-    // If shift is less than -31, RoundingDivideByPOT fails. This happens when
-    // min and max are close and small. For this particular case, both
-    // significand and shift are set to zero.
-    *significand = 0;
-    *shift = 0;
-  }
-}
void QuantizeMultiplier(double double_multiplier, int32_t* quantized_multiplier,
     ++*shift;
   }

if (*shift < -31) {
+    *shift = 0;
+    q_fixed = 0;
+  }

QuantizeMultiplier(effective_output_scale, &significand, &shift);
TEST(QuantizationUtilTest, QuantizeMultiplierUnderflow) {
+  auto quantize = [](double d) {
+    int32_t q;
+    int s;
+    QuantizeMultiplier(d, &q, &s);
+    return std::pair<int32_t, int>{q, s};
+  };
+
+  EXPECT_THAT(quantize(std::ldexp(1.0f, -31)), Pair(1073741824, -30));
+  EXPECT_THAT(quantize(std::ldexp(1.0f, -32)), Pair(1073741824, -31));
+  EXPECT_THAT(quantize(std::ldexp(0.99f, -32)), Pair(0, 0));
+  EXPECT_THAT(quantize(std::ldexp(1.0f, -33)), Pair(0, 0));
+}
180 Tensorflow/Keras 3af3959377d54414f480d617402274f7e9440316 C++ fix overflow overflow Using sqrt(a^2 + b^2) tensor math tensor math absolute value of a complex number rewrite math formula rewrite math formula use sqrt(a^2 + b^2) = sqrt(a^2 * (1 + b^2/a^2))
                                = |a| * sqrt(1 + (b/a)^2)
With the assumption that |a| >= |b|
case HloOpcode::kAbs: {
-      auto sum_sq = FAdd(
-          FMul(EmitExtractReal(operand_value), EmitExtractReal(operand_value)),
-          FMul(EmitExtractImag(operand_value), EmitExtractImag(operand_value)));
-      return llvm_ir::EmitCallToIntrinsic(llvm::Intrinsic::sqrt, {sum_sq},
StatusOr<llvm::Value*> ElementalIrEmitter::EmitComplexAbs(
+    PrimitiveType prim_type, llvm::Value* operand_value) {
+  auto real = EmitExtractReal(operand_value);
+  auto imag = EmitExtractImag(operand_value);
+  auto abs_real = llvm_ir::EmitCallToIntrinsic(llvm::Intrinsic::fabs, {real},
+                                               {real->getType()}, b_);
+  auto abs_imag = llvm_ir::EmitCallToIntrinsic(llvm::Intrinsic::fabs, {imag},
+                                               {imag->getType()}, b_);
+  auto max = EmitFloatMax(abs_real, abs_imag);
+  auto min = EmitFloatMin(abs_real, abs_imag);
+
+  auto div = FDiv(min, max);
+  auto div_sq = FMul(div, div);
+  auto one = llvm::ConstantFP::get(max->getType(), 1);
+  TF_ASSIGN_OR_RETURN(auto sqrt, EmitSqrt(prim_type, FAdd(one, div_sq)));
+
+  auto zero = llvm::ConstantFP::get(max->getType(), 0);
+  return Select(FCmpOEQ(max, zero), zero, FMul(max, sqrt));
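A minimal scalar sketch of the rewritten complex absolute value from entry 180 (plain C++ rather than the LLVM IR emitter; names are illustrative): dividing the smaller magnitude by the larger keeps the squared ratio in [0, 1], so the intermediate never overflows even when a^2 + b^2 would.
#include <algorithm>
#include <cmath>

double StableComplexAbs(double re, double im) {
  const double a = std::fabs(re);
  const double b = std::fabs(im);
  const double mx = std::max(a, b);
  const double mn = std::min(a, b);
  if (mx == 0.0) return 0.0;  // both parts are zero
  const double r = mn / mx;   // in [0, 1], so r * r cannot overflow
  return mx * std::sqrt(1.0 + r * r);
}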
181 Tensorflow/Keras 840f25bd4623e5a9aedcbe6163332f51ee303784 C++ fix overflow overflow signed integer overflow in HandleCopies when batch_size * indices_size * slice_size is larger than int32 data processing parallelism kernels, gather increase variable precision/change variable type increase variable precision use int64 instead of int32 for large values      bool use_large =
+                      batch_size * indices_size * slice_size >
+                          std::numeric_limits<int32>::max());
182 Tensorflow/Keras 8211365f9e8aed8cec7b63d7eb992ab104422f8c C++, Python fix overflow overflow build error on Windows caused by potential int32 overflow data processing data shard size increase variable precision/change variable type increase variable precision increase precision from int32 to int64 to calculate the default shard size self._shard_size_bytes = (
-        shard_size_bytes
-        if shard_size_bytes is not None else 10 * 1024 * 1024 * 1024)
self._pending_snapshot_expiry_seconds = (
         pending_snapshot_expiry_seconds
-        if pending_snapshot_expiry_seconds is not None else 86400)
// Defaults to 10 GiB per shard.
+const int64 kDefaultShardSizeBytes = 10L * 1024 * 1024 * 1024;

   if (shard_size_bytes_ == -1) shard_size_bytes_ = kDefaultShardSizeBytes;
+
+    // Default to 1 day expiry for snapshots.
+    if (pending_snapshot_expiry_seconds_ == -1) {
+      pending_snapshot_expiry_seconds_ = 86400;
+    }

shard_size_bytes if shard_size_bytes is not None else -1)
183 Tensorflow/Keras 8ac1075eac1ab9072e29c025348f749b43f251cf C++ fix overflow overflow optimizers optimizers experimental optimizer limit input range limit input range add upper bound scaling to prevent overflow element = T(UniformDistribution(RandomType(0), RandomType(1), &gen)); auto upper_bound =
+          RandomType(std::is_same<T, Eigen::half>::value ? 0.1 : 1.0);
+      element = T(UniformDistribution(RandomType(0), upper_bound, &gen));
184 Tensorflow/Keras 737600454df83be02fac46e48b093a8892c7241a C++ unit test overflow overflow linear algebra linear algebra matrix multiply limit input range limit input range Avoid the corner case where both lhs and rhs zero_point's are the lowest representable value in their respective quantized type.  E.g. when both LHS and RHS are uint8 with zero_point=0. if (!use_golden && !std::is_floating_point<LhsScalar>::value) {
-    lhs_params.zero_point = random_engine() % 8;

if (!use_golden && !std::is_floating_point<RhsScalar>::value) {
-    rhs_params.zero_point = random_engine() % 8;
if (!std::is_floating_point<LhsScalar>::value) {
+    lhs_params.zero_point = 1;
+    if (!use_golden) {
+      lhs_params.zero_point += random_engine() % 8;
+    }

if (!std::is_floating_point<RhsScalar>::value) {
+    rhs_params.zero_point = 1;
+    if (!use_golden) {
+      rhs_params.zero_point += random_engine() % 8;
+    }
185 Tensorflow/Keras c38b41d7c813e0dc26fa99cf6495ec474a595542 C++ fix overflow overflow possible float-to-integer-cast overflow precision tests/speed benchmarks timing timing increase variable precision/change variable type change variable type change variable for holding processing time from int64 to double const int64 processing_time = TotalProcessingTime(snapshot);
const int64 output_time = OutputTime(snapshot);
int64 best_delta = -1;
int64 new_output_time = OutputTime(snapshot);
int64 delta = output_time - new_output_time;
int64 Model::OutputTime(std::shared_ptr<Node> node)
const double processing_time = TotalProcessingTime(snapshot);
const double output_time = OutputTime(snapshot);
double best_delta = -1.0L;
double new_output_time = OutputTime(snapshot);
double delta = output_time - new_output_time;
double Model::OutputTime(std::shared_ptr<Node> node) {
186 Tensorflow/Keras 52a6cfddef9b51b608b4a554b77a10e1522d56ec C++ fix overflow overflow overflow of variable size data processing parallelism segmented reduction—a parallel reduction over many irregular-length segments. increase variable precision/change variable type increase variable precision change int to int64 for number of threads, block size, block numbers, compute cycles, compute bytes const int num_threads = cpu_device.numThreads();
const int min_block_size = 64;
-    const int max_block_num = std::min(N / min_block_size + 1, num_reductions);
-    int block_num = std::min(max_block_num, num_threads);
-    const int block_size = N / block_num;
const int compute_cycles = 5 * (N - num_reductions) * inner_dim;
-    const int output_bytes = num_reductions * inner_dim * sizeof(T);
const int64 num_threads = cpu_device.numThreads();
const int64 min_block_size = 64;
+    int64 block_num = std::min(num_reductions, num_threads);
+    int64 block_size = (N - 1) / block_num + 1;
+    if (block_size < min_block_size) {
+      block_size = min_block_size;
+      block_num = (N - 1) / min_block_size + 1;
const int64 compute_cycles = 5 * (N - num_reductions) * inner_dim;
+    const int64 output_bytes = num_reductions * inner_dim * sizeof(T);
187 Tensorflow/Keras aa4765a1417950cf2c29afd0172aebdd31b0725f C++ fix overflow overflow cast overflow undefined behavior tensor math tensor math absolute value of a complex number increase variable precision/change variable type increase variable precision Change the function that computes the absolute value to return a double instead of a float, to avoid cast overflow for inputs of types double and complex128.
float FpAbsoluteValue(NativeT value) double FpAbsoluteValue(NativeT value) TEST(LiteralTestUtilTest, ExpectNearDoubleOutsideFloatValueRange) {
+  auto two_times_float_max =
+      LiteralUtil::CreateR0<double>(2.0 * std::numeric_limits<float>::max());
+  ErrorSpec error(0.001);
+  EXPECT_TRUE(
+      LiteralTestUtil::Near(two_times_float_max, two_times_float_max, error));
188 Tensorflow/Keras d0136d4affebd14fee59ba1865d5f1c8fa64251a C++ fix overflow overflow TensorFlow BFC Allocator is a memory allocator that implements a 'best-fit with coalescing' algorithm. index integer overflow data processing memory allocator TensorFlow BFC Allocator increase variable precision/change variable type change variable type change type of an index from int to size_t int IndexFor(const void* p) const
return static_cast<int>(((p_int - base_int) >> kMinAllocationBits));
size_t IndexFor(const void* p) const
return static_cast<size_t>(((p_int - base_int) >> kMinAllocationBits));
189 Tensorflow/Keras f9ac078ebd0d05b64691e6718d404ee801f80c67 C++ fix overflow overflow conversion to float32 results in overflow other other number casting add overflow check add overflow check return error if overflow and return null pointer if infinity double as_double = PyFloat_AsDouble(v);
+    // Handle infinity.
+    if (as_double == std::numeric_limits<double>::infinity()) {
+      *out = std::numeric_limits<T>::infinity();
+      return nullptr;
+    } else if (as_double == -1 * std::numeric_limits<double>::infinity()) {
+      *out = -1 * std::numeric_limits<T>::infinity();
+      return nullptr;
+    }
+    // Check for overflow.
+    if (as_double > std::numeric_limits<T>::max() ||
+        as_double < std::numeric_limits<T>::lowest()) {
+      return ErrorOutOfRangeDouble;
+    }
+    *out = static_cast<T>(as_double);
190 Tensorflow/Keras c8e8f35f3e39b36d105eb7e43321a9da1362f242 C++ fix overflow overflow integer overflow precision tests/speed benchmarks timing timing increase variable precision/change variable type increase variable precision use unsigned long long int event->set_timestamp_ps(node.all_start_micros() * 1000000);
-      event->set_duration_ps(node.all_end_rel_micros() * 1000000);
static constexpr uint64 kMicrosToPicos = 1000ULL * 1000ULL;
const uint64 profile_start_time_micros
    event->set_timestamp_ps(
+          (node.all_start_micros() - profile_start_time_micros) *
+          EnvTime::kMicrosToPicos);
+      event->set_duration_ps(node.all_end_rel_micros() *
+                             EnvTime::kMicrosToPicos);
191 Tensorflow/Keras f1d0c84f699624382c8d66e2ea10205ac0207868 C++ disable test overflow overflow precision tests/speed benchmarks timing timing disable test/warning disable overflow test Skip overflow testing when running with address sanitizer.
192 Tensorflow/Keras 84337310517914ca4b4d6eb35295a65758bc6d75 C++ unit test overflow overflow AveragePool uses a uint16 accumulator which causes it to overflow for
large images
CNN operations pooling layer average pooling add overflow check add overflow check add overflow test N/A // Send in a white image, expect a white pixel.
+TEST(QuantizedPoolingOpTest, AveragePoolImageSize16) {
+  int image_size = 16;
+  QuantizedPoolingOpModel m(
+      BuiltinOperator_AVERAGE_POOL_2D,
+      /*input=*/{TensorType_UINT8, {1, image_size, image_size, 1}, 0, 16},
+      /*filter_width=*/image_size,
+      /*filter_height=*/image_size,
+      /*output=*/{TensorType_UINT8, {}, 0, 16});
+
+  std::vector<float> input(image_size * image_size, 16.f);
+  m.SetInput(input);
+  m.Invoke();
+
+  EXPECT_THAT(m.GetOutput(), ::testing::ElementsAre(255));
+  EXPECT_THAT(m.GetDequantizedOutput(), ElementsAreArray(ArrayFloatNear({16})));
+}
+
+// Send in a white image, expect something other than a white pixel, due to
+// overflow.
+TEST(QuantizedPoolingOpTest, AveragePoolImageSize17) {
+  int image_size = 17;
+  QuantizedPoolingOpModel m(
+      BuiltinOperator_AVERAGE_POOL_2D,
+      /*input=*/{TensorType_UINT8, {1, image_size, image_size, 1}, 0, 16},
+      /*filter_width=*/image_size,
+      /*filter_height=*/image_size,
+      /*output=*/{TensorType_UINT8, {}, 0, 16});
+
+  std::vector<float> input(image_size * image_size, 16.f);
+  m.SetInput(input);
+  m.Invoke();
+
+  // Ordinarily we would see '255' here. However, the optimized version of
+  // AveragePool uses a uint16 accumulator which causes it to overflow for
+  // images this large.
+  EXPECT_THAT(m.GetOutput(), ::testing::ElementsAre(28));
N/A
193 Tensorflow/Keras 434dbe38970ffc90a5b546780be702e0b5de9a0c C++ fix overflow undefined behavior overflow undefined behavior caused by integer overflow in custom float comparison compiler compiler casting limit input range limit input range consider numeric limits return std::numeric_limits<CastType>::max() - casted_value; return static_cast<UnsignedCastType>(std::numeric_limits<CastType>::max()) -
+           casted_value;
194 Tensorflow/Keras fc44600e5c3ccf1de1e3d4792a00d3578311d3f6 Cuda fix overflow overflow index integer overflow linear algebra linear algebra row reduce rewrite math formula rewrite math formula rewrite formula const int row = (blockIdx.x * blockDim.x + threadIdx.x) / 32;
std::size_t temp_storage_bytes = 0;

-  Tensor temp_storage;
-  // written as a loop because it reduces clutter
-  // first pass allocates memory, second launches kernel(s)
-  for (int i = 0; i < 2; ++i) {
-    auto success = cub::DeviceReduce::Reduce(
-        i == 0 ? nullptr : temp_storage.flat<int8_t>().data(),
-        temp_storage_bytes, in, out, in_size, op, init, cu_stream);
assert(blockDim.x % 32 == 0);
+  int warps_per_block = blockDim.x / 32;
+  int warp_index = threadIdx.x / 32;
+  const int row = blockIdx.x * warps_per_block + warp_index;

size_t temp_storage_bytes = 0;
+  auto reduce = [&](void* temp_storage_ptr) {
+    auto success =
+        cub::DeviceReduce::Reduce(temp_storage_ptr, temp_storage_bytes, in, out,
+                                  in_size, op, init, cu_stream);
195 Tensorflow/Keras e66aea59e0367618f924ffe3bc3b1140be8eaf45 C++ fix underflow underflow underflow if data empty data processing data tf.data / Cloud Bigtable rewrite math formula rewrite math formula change order of operations if (index_ > keys_.size() - 2) {
if (index_ + 2 > keys_.size()) {
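A minimal sketch of the reordering in entry 195 (plain C++; names are illustrative): `keys_.size()` is unsigned, so `keys_.size() - 2` wraps around to a huge value whenever fewer than two keys remain and the old comparison gives the wrong answer; keeping the addition on the index side avoids the wrap.
#include <cstddef>
#include <vector>

// Assumes index is far from SIZE_MAX, so index + 2 itself cannot wrap.
bool PastSecondToLast(std::size_t index, const std::vector<int>& keys) {
  // Old form: index > keys.size() - 2  (underflows when keys.size() < 2).
  return index + 2 > keys.size();
}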
196 Tensorflow/Keras 880390941ce6430996c8f842540f73b53f3d1d8e Python fix overflow overflow int32 overflow data processing parallelism parallelism increase variable precision/change variable type increase variable precision use int64 number of segments to guard against int32 overflow num_segments *= n num_segments = math_ops.cast(num_segments, dtypes.int64) * math_ops.cast(
+      n, dtypes.int64)
197 Tensorflow/Keras f0d7172a30954b6696bdf2f40a5be11e7fdeb39c C++ fix overflow overflow int overflow compiler compiler compiler, shape inference add overflow check add overflow check return invalid argument if number of features is not positive if (feature_group_count <= 0) {
+    return InvalidArgument(
+        "feature_group_count must be a positive number, got %d",
+        feature_group_count);
198 Tensorflow/Keras 63bac283d12899a2d769a768729942c4f64436ea C++ fix overflow undefined behavior overflow undefined behavior due to signed integer overflow data processing image processing bmp image decoding rewrite math formula rewrite math formula rewrite the formula so the 4-byte-aligned row size is computed without forming the intermediate product 8 * channels_ * width const int row_size = (8 * channels_ * width + 31) / 32 * 4; const int row_size = (channels_ * width + 3) / 4 * 4;
199 Tensorflow/Keras 4f7a169a7eb97ea4819217f14705d6c2bd125355 C++ fix overflow overflow Need to handle overflow in division and remainder compiler compiler compiler, elemental emitter, division, remainder add overflow check add overflow check Define integer division overflow for CPU/GPU N/A       X / 0 == -1
      X % 0 == X
      INT_SMIN / -1 = INT_SMIN
      INT_SMIN % -1 = 0
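A minimal sketch of the guarded signed division conventions listed in entry 199 (plain C++; the function names are illustrative): both undefined cases, division by zero and INT_MIN / -1, are mapped to the defined results above instead of invoking undefined behavior.
#include <cstdint>
#include <limits>

int32_t SafeDivSketch(int32_t x, int32_t y) {
  if (y == 0) return -1;  // X / 0 == -1
  if (x == std::numeric_limits<int32_t>::min() && y == -1) {
    return std::numeric_limits<int32_t>::min();  // INT_SMIN / -1 == INT_SMIN
  }
  return x / y;
}

int32_t SafeRemSketch(int32_t x, int32_t y) {
  if (y == 0) return x;  // X % 0 == X
  if (x == std::numeric_limits<int32_t>::min() && y == -1) {
    return 0;  // INT_SMIN % -1 == 0
  }
  return x % y;
}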
200 Tensorflow/Keras d7ebc1f4ca2c677710c5257d30c757f0f8b604c6 Python fix overflow overflow overflow in flops calculations in nn_ops.py CNN operations CNN flops calculation, product increase variable precision/change variable type increase variable precision use int64 for product calculation output_count = np.prod(output_shape.as_list())
output_count = np.prod(output_shape.as_list(), dtype=np.int64)
201 Tensorflow/Keras e7674c09a151cac07bae43f6fe8551e8fec6dfe0 C++ fix overflow overflow array index overflow in TransformFilter functor CNN operations convolution indexing, 2D convolution limit input range limit input range subtract 2 from number of dimensions to iterate over for (int i = 0; i < NDIMS; ++i) {            // spatial dimensions for (int i = 0; i < NDIMS - 2; ++i) {        // spatial dimensions
202 Tensorflow/Keras aec5a0191e21ce022f47d743a4954e13f710cd8f C++ fix overflow overflow very large and branchy models, where the number of paths is exponential in the number of nodes, can overflow; specifically, an overflow in hlo_scheduling when compiling AutoML models compiler compiler compiler, HLO (high level operations) limit input range limit input range set min and max for total number of HLOs N/A int64 total_hlos = computation.parent()->NumUniqueInstructionIds();
extra_users[hlo] = std::min(extra_users[hlo], total_hlos);
203 Tensorflow/Keras 503b7c11b44ee8b238946b345efea503058652c0 Python disable test overflow overflow SinhArcsinh: Y = g(X) = Sinh( (Arcsinh(X) + skewness) * tailweight ) * multiplier. overflow test fails other transformations square, Bijective transformations disable test/warning disable overflow test Skipped the check that fails due to overflow error # Do the numpy calculation in float128 to avoid inf/nan.
-        y_float128 = np.float128(y)
-        self.assertAllClose(
-            np.log(np.cosh(
-                np.arcsinh(y_float128) / tailweight - skewness) / np.sqrt(
-                    y_float128**2 + 1)) -
-            np.log(tailweight),
-            bijector.inverse_log_det_jacobian(y, event_ndims=0).eval(),
-            rtol=1e-4,
-            atol=0.)
  # On IBM PPC systems, longdouble (np.float128) is same as double except that it can have more precision.
+        # Type double being of 8 bytes, can't hold square of max of float64 (which is also 8 bytes) and
+        # below test fails due to overflow error giving inf. So this check avoids that error by skipping square
+        # calculation and corresponding assert.
+
+        if np.amax(y) <= np.sqrt(np.finfo(np.float128).max) and \
+           np.fabs(np.amin(y)) <= np.sqrt(np.fabs(np.finfo(np.float128).min)):
+
+          # Do the numpy calculation in float128 to avoid inf/nan.
+          y_float128 = np.float128(y)
+          self.assertAllClose(
+              np.log(np.cosh(
+                  np.arcsinh(y_float128) / tailweight - skewness) / np.sqrt(
+                      y_float128**2 + 1)) -
+              np.log(tailweight),
+              bijector.inverse_log_det_jacobian(y, event_ndims=0).eval(),
+              rtol=1e-4,
+              atol=0.)
204 Tensorflow/Keras f5dbc1e16622f433f41f195bb33f56d674a004ce C++ fix overflow overflow TensorFlow Lite Converter converts TensorFlow graphs into TensorFlow Lite graphs. Overflow in shape calculation:
TensorFlow's shapes use int64s, while TOCO uses ints.
tensor math tensor shape shape, Tensorflow Lite Converter (TOCO) add overflow check add overflow check
205 Tensorflow/Keras 9f312f32091534bfc115212d2ec7c838180df663 C++ fix overflow overflow overflow due to large values other random number generator random tensor generation limit input range limit input range Updating Generate Random Tensor to generate tensors whose values are small and do not cause overflow for arithmetic operations. tensor.flat<T>() = tensor.flat<T>().random(); for (auto i = 0; i < tensor.NumElements(); i++)
+      tensor.flat<T>()(i) = i + random::New64() % 10;
206 Tensorflow/Keras 6a7779f3384e48012d3e27ae0f48d410f5174d06 C++ fix overflow overflow undefined signed integer overflow statistical distributions statistical distributions random uniform distribution limit input range limit input range impose conditions on random number generation to prevent overflow result[i] = lo_ + static_cast<int32>(sample[i] % range_);
result[i] = lo_ + static_cast<int64>(bits % range_);
template <typename Int>
+PHILOX_DEVICE_INLINE Int SignedAdd(Int a,
+                                   typename std::make_unsigned<Int>::type b) {
+  auto b_div_2 = b >> 1;
+  return a + static_cast<Int>(b_div_2) + static_cast<Int>(b - b_div_2);

result[i] = SignedAdd(lo_, sample[i] % range_);
result[i] = SignedAdd(lo_, bits % range_);
207 Tensorflow/Keras d107fee1e4a9a4462f01564798d345802acc2aef C++ fix overflow overflow other other I/O limit input range limit input range consider numeric limits N/A if (kBlockTrailerSize > std::numeric_limits<size_t>::max() - n) {
+    return errors::DataLoss("handle.size() too big");
+  }
+
N249
208 Tensorflow/Keras 665a4bf664546224c65eeb5a0a52d80e48e2f3e1 C++ fix overflow overflow int64 overflow and low accuracy compiler compiler compiler, HLO (high level operations), size use a different algorithm use a different algorithm The new implementation computes the
    min of the previous overestimate and the sum of all HLO's
    before-and-including the current HLO in a topological sort of the
    graph.
209 Tensorflow/Keras 11f1e50886f91ce2caa6e53b0bc9a1e82abdda8e Python unit test overflow overflow exp() test overflowing tensor math tensor math exponential limit input range limit input range Keep the results below 2^31 in exp(), consider min and max create_tensor_data(parameters["input_dtype"], parameters["input_shape"]) create_tensor_data(parameters["input_dtype"], parameters["input_shape"],
+                           min_value=-100, max_value=9)
210 Tensorflow/Keras 49f73c55d56edffebde4bca4a407ad69c1cae433 C++ fix overflow overflow integer overflow data processing image processing bmp image decoding increase variable precision/change variable type increase variable precision Fix integer overflow in BMP decoder by making the checks in DecodeBmp
more stringent. Total possible pixel bytes must be less than 2^30. Also, increase precision of image size from int to int64. Add fuzzer to improve the robustness of the decoder in the future.
const int last_pixel_offset =
-        header_size + (abs(height) - 1) * row_size + (width - 1) * channels_;
-    const int expected_file_size = last_pixel_offset + channels_;
OP_REQUIRES(context, width > 0 && header_size >= 0,
+                errors::InvalidArgument("Width must be positive"));
+    OP_REQUIRES(context, header_size >= 0,
+                errors::InvalidArgument("header size must be nonnegative"));
+
+    // The real requirement is < 2^31 minus some headers and channel data,
+    // so rounding down to something that's still ridiculously big.
+    OP_REQUIRES(
+        context,
+        (static_cast<int64>(width) * std::abs(static_cast<int64>(height))) <
+            static_cast<int64>(std::numeric_limits<int32_t>::max() / 8),
+        errors::InvalidArgument(
+            "Total possible pixel bytes must be less than 2^30"));
+
+    const int32 abs_height = abs(height);

const int64 last_pixel_offset = static_cast<int64>(header_size) +
+                                    (abs_height - 1) * row_size +
+                                    (width - 1) * channels_;

const int64 expected_file_size = last_pixel_offset + channels_;
211 Tensorflow/Keras 7f88363810e77a39db919fb4000583ad0138e53c C++ fix overflow overflow integer overflow other computational graph shape size propagation in a tf graph increase variable precision/change variable type increase variable precision increase precision from int to int64 for max loops const int num_loops = new_shapes->size();
-  const int max_loop_length = item_.graph.node_size();
-  const int max_rank = 4;
-  const int max_loop_iterations =
-      max_rank * max_loop_length * std::max(1, num_loops * num_loops);
-  const int num_queues = resources.size();
-  const int max_resource_iterations = num_queues * num_queues * max_rank;
-
-  int num_resource_iterations = 0;

int num_loop_iterations = 0;
const int64 num_loops = new_shapes->size();
+  const int64 max_loop_length = item_.graph.node_size();
+  const int64 max_rank = 4;
+  const int64 max_loop_iterations =
+      max_rank * max_loop_length * std::max<int64>(1, num_loops * num_loops);
+  const int64 num_queues = resources.size();
+  const int64 max_resource_iterations = num_queues * num_queues * max_rank;
+
+  int64 num_resource_iterations = 0;
   do {
+    int64 num_loop_iterations = 0;
212 Tensorflow/Keras 192f1c24ec6692342391c03bb620f5de1af9de3b C++ fix overflow overflow integer overflow data processing parallelism parallelism rewrite math formula rewrite math formula rewrite formula for calculating maximum number of elements -         input_shape.num_elements() >=
-             std::max(num_threads, num_split) * 4096 &&
-         input_shape.num_elements() < num_split * 180 * 1024);

-            num_split, kint64max, range_output_func);
-         input_shape.num_elements() >=
-             std::max(num_threads, num_split) * 4096 &&
-         input_shape.num_elements() < num_split * 180 * 1024);
-            num_split, kint64max, range_output_func);
const auto input_element_count = input_shape.num_elements();
input_element_count >= std::max(num_threads, num_split) * 4096 &&
+         input_element_count < num_split * 180 * 1024);
num_split, input_element_count / num_split, range_output_func);
input_element_count >= std::max(num_threads, num_split) * 4096 &&
+         input_element_count < num_split * 180 * 1024);
num_split, input_element_count / num_split, range_output_func);
213 Tensorflow/Keras b1c095a28a7aa9bbee4af4d9a7e9d0c60567765b Python fix underflow underflow underflow in log probability statistical distributions statistical distributions multinomial distribution, log probability use a different algorithm use a different algorithm use log softmax and logits instead of log and probabilities return math_ops.reduce_sum(counts * math_ops.log(self.probs), -1)
return math_ops.reduce_sum(counts * nn_ops.log_softmax(self.logits), -1) def testPmfUnderflow(self):
+    logits = np.array([[-200, 0]], dtype=np.float32)
+    with self.test_session():
+      dist = multinomial.Multinomial(total_count=1., logits=logits)
+      lp = dist.log_prob([1., 0.]).eval()[0]
+      self.assertAllClose(-200, lp, atol=0, rtol=1e-6)
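A hedged NumPy illustration of why entry 213 replaces log(probs) with a log-softmax of the logits: for a very negative logit the probability underflows to 0 in float32, so its log becomes -inf, while the log-softmax form stays finite (the values mirror testPmfUnderflow above).
import numpy as np

logits = np.array([-200.0, 0.0], dtype=np.float32)

probs = np.exp(logits) / np.exp(logits).sum()     # exp(-200) underflows to 0 in float32
naive = np.log(probs)                             # [-inf, 0.0], with a divide-by-zero warning

m = logits.max()
log_softmax = logits - (m + np.log(np.sum(np.exp(logits - m))))   # [-200.0, 0.0]

counts = np.array([1.0, 0.0], dtype=np.float32)
print((counts * naive).sum(), (counts * log_softmax).sum())       # -inf vs. -200.0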
214 Tensorflow/Keras 74137f994faad09593ae2daad6251a4ccf72f558 C++ fix overflow overflow When a node name has a long numeric suffix, e.g.,
    "foo/y_0/gradient_debug_09684b60f2184c67b744721915034528" (as has happened with tfdbg GradientsDebugger),
    
    the parsing algorithm in ParseTensorName() may experience signed int overflow.
other other tensor name parser increase variable precision/change variable type change variable type use unsigned int instead of signed int -  int index = 0;
-  int mul = 1;
unsigned int index = 0;
+  unsigned int mul = 1;
215 Tensorflow/Keras 793fa4e91d3cae77565f753c2b8d769e1a3928f8 Python fix overflow overflow The vimco package provides a Bayesian variable selection method for GWAS data with multiple traits. Unlike in BVSR, where each trait is analyzed separately, vimco performs a joint analysis of the multiple traits while accounting for correlation among them. Csiszar f-Divergence generalized VIMCO objective overflow issue in Csiszar-VIMCO other probability gradient estimator, csiszar divergence rewrite math formula rewrite math formula rewrite formula for log sum - subtract maximum from input log_sum_u = math_ops.reduce_logsumexp(logu, axis=0)
return log_sum_u - log_n, log_soosum_u - log_n
log_max_u = math_ops.reduce_max(logu, axis=0)
+    log_sum_u_minus_log_max_u = math_ops.reduce_logsumexp(
+        logu - log_max_u, axis=0)


is_positive_and_largest = math_ops.logical_and(
+        logu > 0.,
+        math_ops.equal(logu, log_max_u[array_ops.newaxis, ...]))
+    log_lomsum_u = math_ops.reduce_logsumexp(
+        array_ops.where(is_positive_and_largest,
+                        array_ops.fill(array_ops.shape(logu), -inf),
+                        logu),
+        axis=0, keep_dims=True)
+    log_lomsum_u = array_ops.tile(
+        log_lomsum_u,
+        multiples=1 + array_ops.pad([n-1], [[0, array_ops.rank(logu)-1]]))
+
+    d_not_ok_result = array_ops.where(
+        is_positive_and_largest,
+        log_lomsum_u,
+        array_ops.fill(array_ops.shape(d), -inf))
+
+    log_loosum_u = array_ops.where(d_ok, d_ok_result, d_not_ok_result)

  
log_avg_u = log_sum_u_minus_log_max_u + log_max_u - log_n
+    log_sooavg_u = log_soosum_u - log_n
+
+    log_avg_u.set_shape(logu.shape.with_rank_at_least(1)[1:])
+    log_sooavg_u.set_shape(logu.shape)
+
+    return log_avg_u, log_sooavg_u
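A small NumPy sketch of the max-subtraction step entry 215 introduces (the log-values are hypothetical): exponentiating large entries of logu directly overflows to inf, while subtracting the maximum first keeps every exponent at or below zero and only shifts the result.
import numpy as np

logu = np.array([800.0, 750.0, 700.0])            # hypothetical large log-values

naive = np.log(np.sum(np.exp(logu)))              # exp(800) overflows float64 -> inf

log_max_u = np.max(logu)
log_sum_u_minus_log_max_u = np.log(np.sum(np.exp(logu - log_max_u)))
stable = log_sum_u_minus_log_max_u + log_max_u    # ~800.0, finite
print(naive, stable)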
216 Tensorflow/Keras d906c963269dd1522c7693c8f944e6a846b86221 C++ unit test overflow overflow signed integer overflows detected with -fsanitize=signed-integer-overflow compiler compiler compiler, shape inference increase variable precision/change variable type change variable type, add overflow check use unsigned int instead of signed int to prevent undefined behavior and report error if overflow const int64 b = a + 1;
const int64 sum = first_value + second_value;
-  int64 result = 0;
[](int i) { return static_cast<float>(i * i * i); });
const int64 b = a - 1;
const int64 sum = static_cast<uint64>(first_value) + second_value;

uint64 result = 0;
[](int i) { return static_cast<float>(i) * i * i; });
217 Tensorflow/Keras 931fd84bb72df0500f512d5d92ec0bef2ea461be Python fix overflow overflow numpy.prod overflow on Windows gradients/derivatives gradients shape, gradient, tensor increase variable precision/change variable type increase variable precision perform computations in int64 instead of int32 and then convert result to int32 shape_size = np.prod(shape)
num_elements = np.prod(shape)
params_shape = array_ops.shape(params)
shape_size = np.prod(shape, dtype=np.int64)
num_elements = np.prod(shape, dtype=np.int64)
params_shape = array_ops.shape(params, out_type=ops.dtypes.int64)
params_shape = math_ops.to_int32(params_shape)
218 Tensorflow/Keras e8ee5286a686c6fc3057ba7cf9ba9ef7003789a6 C++ fix overflow overflow data processing data tensor shape, multiply, size limit input range limit input range, add overflow check Remove 2**40 size limit on TensorShape, use std::numeric_limits instead. The previous TensorShape code did not check for overflow when multiplying
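A Python sketch, under assumed semantics, of the overflow-checked multiply entry 218 is described as adding: before growing the element count by a new dimension, verify that the product still fits in a signed 64-bit integer rather than relying on the old fixed 2**40 cap. The helper name is hypothetical.
import numpy as np

INT64_MAX = np.iinfo(np.int64).max

def multiply_checked(num_elements: int, dim: int) -> int:
    # Reject the multiply up front instead of letting it silently overflow.
    if dim > 0 and num_elements > INT64_MAX // dim:
        raise OverflowError("TensorShape element count would exceed int64")
    return num_elements * dim

print(multiply_checked(2**40, 1024))              # fine: 2**50
# multiply_checked(2**40, 2**30) would raise OverflowError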
219 Tensorflow/Keras 3c9ba5673cf560ded0739530b673ab0a05d43630 C++ unit test overflow overflow integer overflow, undefined behavior, square other random number generator pseudo-random number generator increase variable precision/change variable type increase variable precision cast from int32 to int64 sum += Square(counts[i] - expected_count); sum += Square(static_cast<int64>(counts[i] - expected_count));
220 Tensorflow/Keras 60e7360dfcf8951c4a269cfddd2a9cf2a05d7f91 Python fix overflow/underflow overflow/underflow Adjust the brightness of RGB or Grayscale images. The current implementation (i.e. without clipping before conversion) introduces different behavior for images with different original data types, i.e. uint8 or float32. data processing image processing images, adjust brightness limit input range limit input range clip image into [0.0, 1.0] before converting back to original data type in 'adjust_brightness' N/A adjusted = clip_ops.clip_by_value(adjusted, 0.0, 1.0) def testNegativeDeltaFloat(self):
+    x_shape = [2, 2, 3]
+    x_data = [0, 5, 13, 10, 135, 226, 37, 8, 245, 90, 255, 1]
+    x_np = np.array(x_data, dtype=np.float32).reshape(x_shape) / 255.
+
+    y_data = [0, 0, 3, 0, 125, 216, 27, 0, 235, 80, 245, 0]
+    y_np = np.array(y_data, dtype=np.float32).reshape(x_shape) / 255.
+
+    self._testBrightness(x_np, y_np, delta=-10. / 255.)
221 Tensorflow/Keras ec58d4042790e71172964383f737b249289d15af Python fix underflow underflow statistical distributions statistical distributions gumbel distribution limit input range limit input range set min value with np.finfo(np_dtype).tiny
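A hedged NumPy sketch of the pattern named in entry 221 (the sampler details are assumed): the Gumbel transform -log(-log(u)) blows up if a uniform sample is exactly 0, so the draw is clamped below by np.finfo(np_dtype).tiny before the logs are taken.
import numpy as np

np_dtype = np.float32
rng = np.random.default_rng(0)

u = rng.random(10_000, dtype=np_dtype)            # uniform samples in [0, 1)
u = np.maximum(u, np.finfo(np_dtype).tiny)        # underflow guard from the patch
gumbel = -np.log(-np.log(u))                      # finite for every sample
print(np.isfinite(gumbel).all())                  # True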
222 Tensorflow/Keras 096ab75275862f973b2fd1a369a9fd25952a6c37 C++ fix overflow overflow text files larger than 2B words overflow the corpus size counter other NLP word2vec embedding, size increase variable precision/change variable type increase variable precision increase precision of corpus size from int32 to int64 int32 corpus_size_ = 0; int64 corpus_size_ = 0;
223 Tensorflow/Keras e6e06b2fc89d41556d159d1181a558f8f5352b87 C++ fix overflow overflow other other strings rewrite math formula rewrite math formula rewrite formula for checking overflow if (new_v < v) { if (new_v / 8 < v) { // (2^64-1)*10+9
+  TestConsumeLeadingDigits("184467440737095516159yz", -1,
+                           "184467440737095516159yz");
224 Tensorflow/Keras 4ad8912996a25136a280312de3801f30dd4d4a74 C++ unit test overflow overflow overflow in float-int32 cast quantization quantization quantization use a different algorithm use a different algorithm const int values_count = sizeof(T) == 256 ? 256 : 50000;
if (sizeof(T) == 256) {
input_array(i) = Eigen::NumTraits<T>::lowest() +
                      static_cast<int32>(q_range / values_count * i);
const int values_count = sizeof(T) == 1 ? 256 : 50000;
if (sizeof(T) == 1) {
int64 offset = static_cast<int64>(q_range / values_count * i);
+          input_array(i) = static_cast<int32>(
+              Eigen::NumTraits<T>::lowest() +
+              std::min<int64>(Eigen::NumTraits<T>::highest(), offset));
225 Tensorflow/Keras f4264cb8e1ea70c612170ed72b9fe0382d1967a0 C++ fix overflow overflow overflow when using float in eigen to quantize to QInt32 quantization quantization quantization limit input range limit input range use bounds that can be converted back to int32 without going outside the range of an int32. static float upper_bound_float() {
+    return Eigen::numext::mini(
+        static_cast<float>(Eigen::NumTraits<T>::highest()), +2.147483520e+09f);

static float lower_bound_float() {
+    return Eigen::numext::maxi(
+        static_cast<float>(Eigen::NumTraits<T>::lowest()), -2.147483648e+09f);
226 Tensorflow/Keras 6047c6977dbc30f018b8b3ea0486ca907901dabb C++ fix overflow overflow data processing data png I/O increase variable precision/change variable type increase variable precision Force height*row_bytes computations to use 64 bits. N/A int64 height = static_cast<int64>(height_in);
227 Tensorflow/Keras f4686d27a705bd547b828693462714d31bfd21ce C++ fix overflow overflow static_cast overflow in WorkSharder Shard data processing data shard, dataset increase variable precision/change variable type increase variable precision cast intermediate variable to higher precision const int num_shards = std::max(
-      1, std::min<int>(num_workers, total * cost_per_unit / kMinCostPerShard));
const int num_shards =
+      std::max<int>(1, std::min(static_cast<int64>(num_workers),
+                                total * cost_per_unit / kMinCostPerShard));
TEST(Shard, OverflowTest) {
+  thread::ThreadPool threads(Env::Default(), "test", 3);
+  mutex mu;
+  for (auto workers : {1, 2, 3}) {
+    const int64 total_elements = 1LL << 32;
+    const int64 cost_per_unit = 10000;
+    int num_shards = 0;
+    int64 num_elements = 0;
+    Shard(workers, &threads, total_elements, cost_per_unit,
+          [&mu, &num_shards, &num_elements](int64 start, int64 limit) {
+            mutex_lock l(mu);
+            ++num_shards;
+            num_elements += limit - start;
+          });
+    EXPECT_EQ(num_shards, workers);
+    EXPECT_EQ(num_elements, total_elements);
+  }
+}
228 PyTorch e6000a7c045cbece5fbfd7d933c39e40b1625037 Python Disable test loss of precision Quantize = convert from float 32 to int 8, dequantize = convert from int 8 to float 32

During training, all calculations are done in floating point, with fake_quant modules modeling the effects of quantization by clamping and rounding to simulate the effects of INT8. After model conversion, weights and activations are quantized, and activations are fused into the preceding layer where possible. It is commonly used with CNNs and yields a higher accuracy compared to static quantization. Quantization Aware Training is also known as QAT.
test_numerical_consistency_per_tensor in test_fake_quant is failing on Windows. The test is comparing numerical consistency between CPU quantize/dequantize op and the CPU fake quantize op. quantization quantization testing, quantization disable test/warning disable precision test Temporarily disables a test for comparing numerical consistency between CPU quantize/dequantize op and the CPU fake quantize op
229 PyTorch 02d318461e5c7bded304c42ed7075de84f71dac6 Python Disable test loss of precision Quantized operations require FBGEMM. FBGEMM (Facebook GEneral Matrix Multiplication) is a low-precision, high-performance matrix-matrix multiplication and convolution library for server-side inference. FBGEMM is only optimized for CPUs with AVX2 instruction set support or newer. In PyTorch, quantization currently supports two backends: fbgemm (for use on x86) and qnnpack (for use on ARM)
Failing test quantization quantization testing, quantization disable test/warning disable precision test Temporarily disable test_numerical_consistency_per_channel due to failure
230 PyTorch b7038f7c37e955f7400459bbfc9382a77b16377d Python Change exception to a warning loss of precision exception This test script compares if two values are “close enough” and handles +inf, -inf, nan numerical differences raise exception precision tests/speed benchmarks accuracy testing tensor compare, testing accuracy, JIT relax accuracy test tolerance relax accuracy test tolerance changes errors to warnings when numerical differences found by replacing self.assertRaisesRegex with assertWarnsRegex
231 PyTorch 032e4f81a8df14fe8b7177957f73567fa04919e8 Python Unit test overflow Test for overflow does not verify that all listed conditions throw, just the first one precision tests/speed benchmarks overflow test testing fix test/warning fix overflow check Update test to check that the correct exceptions are raised when attempting to convert an invalid value to a certain type. Refactor code: add 'with' and 'assert' for every condition.
232 PyTorch 86abc8cd481bfa2b9bb741722770796966778ab1 C++ Change exception to a warning overflow PyTorch has a JIT compiler and a method to allow for inserting instructions as the compiler is compiling on the go. In this case an overflow check is inserted. other other C++ interpreter fix test/warning change variable type, change exception to a warning Change an exception to a non-fatal warning. Also change a cast to use the unsigned variants of the 16- and 64-bit integer types, which doubles the number of positive values these types can represent and gives them well-defined overflow behavior: if they do overflow, they simply wrap around instead of invoking undefined behavior. throw std::runtime_error("safe_narrow_cast<>() failed due to overflow");
safe_narrow_cast<int16_t, int64_t>(N));
TORCH_WARN(
+        "ATTENTION: your model computation is overflowing, safe_narrow_cast<>() failed");
+    return v;

safe_narrow_cast<uint16_t, uint64_t>(N));
233 PyTorch 2171f910531be28f7d5dd8e6ab8bff3a5486e6fd Python Unit test overflow ROCm is the first open-source software development platform for HPC/Hyperscale-class GPU computing. The test was previously turned off because of broken continuous integration on ROCm precision tests/speed benchmarks overflow test testing overflow, Cuda add overflow check add overflow check re-enable cuda_kernel_loop_overflow_large test def test_cuda_kernel_loop_overflow_large(self):
         # Make sure input.numel() > INT_MAX is handled:
         x = torch.randn(1, 1, 1, 2**31, dtype=torch.float16, device="cuda")
234 PyTorch 916eee182c9dc8d335501f6672842c6d29f0af58 Python Unit test overflow A test that checks input shape of 2D convolution prints overflowed integers.
Bug in error message:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 6 1 5 5 2323362894317625376, but got 5-dimensional input of size [1, 10, 1, 28, 28] instead
Correct error message:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 6 1 5 5, but got 5-dimensional input of size [1, 10, 1, 28, 28] instead
CNN operations convolution 2D convolution fix test/warning correct error message add unit test to test shape mismatch for 2d convolutions def test_mismatch_shape_conv2d(self):
+        x = torch.randn(1, 10, 1, 28, 28)
+        w = torch.randn(6, 1, 5, 5)
+
+        with self.assertRaisesRegex(RuntimeError,
+                                    r'Expected 4-dimensional input for 4-dimensional weight 6 1 5 5,' +
+                                    r' but got 5-dimensional input of size \[1, 10, 1, 28, 28\] instead'):
+
+            F.conv2d(x, w)
235 PyTorch 3805be62c1bb10b8bf4e645aac30d89efd8f79ab Python Unit test overflow quantization test fails due to overflow when width parameter is specified quantization quantization quantization, testing increase variable precision/change variable type increase variable precision skip test and get rid of width parameter. Note: no longer in pytorch @given(A=hu.tensor(shapes=((3, 4, 5),),
                        qparams=hu.qparams()),
-           b=st.floats(allow_infinity=False, allow_nan=False, width=32))

@unittest.skip("FIXME: Failing due to overflow error without width option")
     @given(A=hu.tensor(shapes=((3, 4, 5),),
                        qparams=hu.qparams()),
+           b=st.floats(allow_infinity=False, allow_nan=False))
236 PyTorch 1ed488da4f88ec7b85ba5f6a4113908dda3681e3 Python Unit test loss of precision non-standard precision non-standard precision custom precision testing fix test/warning fix precision test fix precision test for inplace mode
         for inplace in (True, False):
-            if len(decl) == 3:
-                name, constr, arg_constr = decl
-                desc = ''
-            elif len(decl) == 4:
-                name, constr, arg_constr, desc = decl
             if inplace:
                 name = name + '_'
             if not hasattr(tensor, name):
@@ -335,8 +337,6 @@ for decl in tests:
             if desc:
                 test_name += '_' + desc

-            precision = custom_precision.get(name, TestCuda.precision)
     for t in types:
         tensor = t()
         gpu_tensor = get_gpu_type(t)()
+        if len(decl) == 3:
+            name, constr, arg_constr = decl
+            desc = ''
+        elif len(decl) == 4:
+            name, constr, arg_constr, desc = decl
+
+        precision = custom_precision.get(name, TestCuda.precision)
237 Tensorflow/Keras 37af1b8790d633b9002ab04a0e664ca3c1dbe508 Python fix loss of precision data processing batch normalization batch normalization rewrite math formula rewrite math formula Do not use a moving average in batch normalization, since the moments method that calculates the mean of the input already implements this logic in a numerically stable way
238 Tensorflow/Keras f93960d0afdcf59457b614158ee5575ca2acfe15 Python fix N/A incorrect comment about numerical stability statistical distributions statistical distributions Beta distribution fix test/warning delete incorrect comment
239 PyTorch 8c8918c3412aa1a7a50df02cddfd66be948d2ace C++ Fix overflow non-standard precision non-standard precision half precision, overflow testing fix test/warning fix overflow check make half precision overflow checks consistent with other types template<> bool overflows<Half, double>(double f) {
+  using limit = std::numeric_limits<double>;
+  if (limit::has_infinity && std::isinf(f)) {
+    return false;
+  }
240 PyTorch 79c3ebc040c4bac896477030d8af4ac94bc6f440 Python Unit test loss of precision Unit test was not aware of the precision of inputs activation functions activation functions testing fix test/warning fix overflow check Add argument to make assertion aware of precision of inputs.
241 PyTorch 2b902e9738f5346050814b40db3ec67faf37128a C++ fix loss of precision An offset within an array or other data structure object is an integer indicating the distance (displacement) between the beginning of the object and a given element or point, presumably within the same object. The concept of a distance is valid only if all elements of the object are of the same size (typically given in bytes or words).

For example, in A as an array of characters containing "abcdef", the fourth element containing the character 'd' has an offset of three from the start of A.

In assembly language an offset usually denotes the number of address locations added to a base address in order to get to a specific absolute address.
offset numerical bug when casting quantization quantization quantization, caffe2, type conversion increase variable precision/change variable type change variable type change all_offsets variable type from float to int32_t std::vector<std::vector<float>>* all_offsets) std::vector<std::vector<int32_t>>* all_offsets)
242 PyTorch 5292685d2f144d9781ab8b7991c0a1153098a477 C++ Fix loss of precision -inf, NaN Logarithms of determinants of large positive definite matrices appear ubiquitously in ML. Log-determinant computation involves the Cholesky decomposition. There is a loss of precision when the diagonal matrix contains small values. The log determinant of a square matrix yields -inf when the matrix entries are very small numbers. The result is -inf if the input has zero determinant; if the input has a negative determinant, the result is NaN linear algebra linear algebra linear algebra, log of matrix determinant use a different algorithm use a different algorithm Use the sign of the diagonal of U instead of the matrix determinant when diag_U has many small values.
determinant of a matrix, log of a matrix
243 PyTorch 67f2039f4ce233754910ebc24fbfcc8bc68685ae Python Fix inefficient algorithm slow execution The binomial distribution is used when there are exactly two mutually exclusive outcomes of a trial, e.g., a coin toss has only two outcomes: heads and tails. A single binary outcome has a Bernoulli distribution, and a sequence of binary outcomes has a Binomial distribution. The binomial distribution gives the discrete probability distribution P_p(n|N) of obtaining exactly n successes out of N Bernoulli trials (where the result of each Bernoulli trial is true with probability p and false with probability q=1-p). Log probability in the binomial distribution has numerical stability issues. The issue manifests itself when `total_count` is high and `probs` is very low. step size unreasonably small statistical distributions statistical distributions distributions, log probability, binomial distribution rewrite math formula rewrite math formula log probability method in binomial distribution is unstable max_val = (-self.logits).clamp(min=0.0)
value * self.logits + self.total_count * max_val -
-                self.total_count * torch.log1p((self.logits + 2 * max_val).exp()))
value * self.logits - self.total_count * torch.log1p(self.logits.exp())) @unittest.skipIf(not TEST_NUMPY, "NumPy not found")
    def test_binomial_log_prob_float(self):
        probs = torch.tensor([1e-5, 0.99999], dtype=torch.float)
        total_count = 1000000.
        x = torch.tensor([10, 9999], dtype=torch.float)
        expected = scipy.stats.binom(total_count, probs.numpy()).logpmf(x.numpy())
        log_prob = Binomial(total_count, probs).log_prob(x)
        # Comparison is again scipy distributions which use float64.
        self.assertTrue(np.allclose(log_prob, expected, rtol=0.05))
        logits = probs_to_logits(probs, is_binary=True)
        log_prob = Binomial(total_count, logits=logits).log_prob(x)
        self.assertTrue(np.allclose(log_prob, expected, rtol=0.05))
244 PyTorch a17c0118a52d34c97ab48bae416ae1896ad14e56 C++ Fix overflow NaN loss overflow Binary Cross Entropy (BCE) is a loss function used for binary classification tasks to measure the difference between true labels and predicted labels. BCE with logits takes logits, not predicted labels as input, but serves the same purpose. Binary cross entropy with logits is unstable with the positive weights argument when logits are large negative values and results in an inf. Positive weight is a weight of positive examples and must be a vector with length equal to the number of classes. loss functions loss functions binary cross entropy loss rewrite math formula rewrite math formula instead of multiplying by 1 + exp(-input), add exp(-input-max_val) loss = (1 - target).mul_(input).add_(log_weight.mul_((-max_val).exp_().mul_(1 + (-input).exp_()).log_().add_(max_val))); loss = (1 - target).mul_(input).add_(log_weight.mul_(((-max_val).exp_().add_((-input - max_val).exp_())).log_().add_(max_val)));
def test_bce_with_logits_stability(self):
+        output = torch.tensor([0., -120.])
+        target = torch.tensor([0., 1.])
+        pos_weight = torch.tensor([1., 1.])
+
+        out1 = nn.BCEWithLogitsLoss()(output, target)
+        self.assertTrue(torch.isfinite(out1).all().item())
+
+        out2 = nn.BCEWithLogitsLoss(pos_weight=pos_weight)(output, target)
+        self.assertTrue(torch.isfinite(out2).all().item())
H_p(q) = -1/N * sum_from_i=1_to_N(y_i * log(p(y_i)) + (1-y_i) * log(1-p(y_i))) log, multiply
245 PyTorch 00d2befba11a1e9c85146a4470721eb75596d5b7 Cuda Fix loss of precision TH = TorcH
This is in directory aten/src, which contains the low-level tensor libraries for PyTorch, as well as the new ATen C++ bindings. The low-level libraries trace their lineage from the original Torch. There are multiple variants of the library, summarized here:

TH = TorcH
THC = TorcH Cuda
THCS = TorcH Cuda Sparse (now defunct)
THCUNN = TorcH CUda Neural Network (see cunn)
THNN = TorcH Neural Network (now defunct)
THS = TorcH Sparse (now defunct)
unstable variance over TorcH Cuda tensor outer dimensions (THTensor_varOuterDim) tensor math tensor math low level tensor math, variance calculation, GPU increase variable precision/change variable type change variable type Use Accreal variable type instead of real def test_var_stability(self):
        tensor = torch.FloatTensor([2281.5, 2281.25]).cuda()

        # Stability for inner dim
        self.assertEqual(tensor.var(0)[0], 0.03125)

        # General stability
        self.assertEqual(tensor.var(), 0.03125)

        # Stability for outer dimensions
        tensor = tensor.unsqueeze(1)
        self.assertEqual(tensor.var(0)[0], 0.03125)
variance
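A hedged float32 illustration for entry 245 of the cancellation that motivates accumulating in a wider type (analogous to switching real to accreal): for the two values from test_var_stability, a common single-pass E[x^2] - E[x]^2 formulation computed entirely in float32 loses every significant digit, while float64 accumulation recovers the population variance 0.015625 (the test's 0.03125 is the unbiased, n-1 variant).
import numpy as np

x32 = np.array([2281.5, 2281.25], dtype=np.float32)
x64 = x32.astype(np.float64)

naive32 = np.mean(x32 * x32) - np.mean(x32) ** 2    # ~0.0: catastrophic cancellation
wide64 = np.mean(x64 * x64) - np.mean(x64) ** 2     # 0.015625
print(naive32, wide64)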
246 PyTorch 72a257584efa7fb63b14f09d19efc96caa5d6e4d Cuda Fix overflow/underflow overflow/underflow Log sigmoid is a logistic non-linear activation function. However, softmax is typically preferred over sigmoid numerically unstable logsigmoid activation functions activation functions log sigmoid rewrite math formula rewrite math formula rewrite formula for log sigmoid considering the maximum representable values const float fmax =
      (float)((int32_t)(uint32_t)qmax - (int32_t)(uint32_t)zero_point);

const T z = THCNumerics<T>::exp(- *input);
-    *gradInput = *gradOutput * z / (1.f + z);
const T max = fmaxType(0.f, -*input);
+    const T z = THCNumerics<T>::exp(-max) + THCNumerics<T>::exp(-*input -max);
+    T max_deriv = 0.f;
+    T sign = -1.f;
+    if (*input < 0.f){
+        max_deriv = -1.f;
+        sign = 1.f;
+    }
+    *gradInput = *gradOutput * (-max_deriv - sign*((z - 1.f)/z));
+    *gradInput = *gradOutput * (-max_deriv - sign*((z - 1.f)/z));
log(1/(1+e^(-x))) log sigmoid
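A hedged NumPy rendering of the rewritten forward formula in entry 246 (same max-shift as the CUDA code above): log sigmoid(x) = -(max(0, -x) + log(exp(-max(0, -x)) + exp(-x - max(0, -x)))), which keeps both exponents non-positive so nothing overflows or underflows for large |x|.
import numpy as np

def log_sigmoid(x):
    m = np.maximum(0.0, -x)                       # the "max" term from the patch
    return -(m + np.log(np.exp(-m) + np.exp(-x - m)))

x = np.array([-1000.0, -10.0, 0.0, 10.0, 1000.0])
print(log_sigmoid(x))   # finite everywhere: ~[-1000, -10.00005, -0.693, -4.5e-05, -0.0]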
247 PyTorch f555c6308c534dd3964d106f2551067fad6edaec Cuda, C++ Fix loss of precision Normalized gradient helps to ameliorate issues with gradient descent such as slow convergence and getting stuck in saddle points. The normalized gradient is the gradient divided by its magnitude; it therefore only provides the direction for gradient descent and does not affect the magnitude of the step size. The gradient magnitude is calculated as the square root of the sum of squares of the gradient vector. The normalization operation for the gradient is unstable due to the sum-of-squares operation gradients/derivatives gradients gradient normalization rewrite math formula rewrite math formula rewrite math formula in Cuda:
grad_mat[index] = (y_ij / x_ij) * (dy_ij - y_ij) * row_sum;

in C++:
gradInMat = ((outputMat / inputMat) * (gradOutMat - outputMat)).rowwise() *
      (gradOutMat * inputMat).colwise().sum();
in Cuda:
grad_mat[index] = (dy_ij / row_norm) - ((x_ij / row_norm_3) * row_sum);

in C++:
auto square = inputMat.square();
  auto norm = square.colwise().sum().sqrt();
  gradInMat = gradOutMat.rowwise() * norm.inverse() -
      ((inputMat.rowwise() / norm.pow(3)).rowwise() *
       (gradOutMat * inputMat).colwise().sum());
gradient/||gradient|| sum of squares, square root
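A hedged NumPy version for entry 247 of the rewritten backward formula, shown for the single-vector case of the row-wise C++/CUDA code above: for y = x/||x|| with upstream gradient g, grad_x = g/||x|| - x * (g . x)/||x||^3.
import numpy as np

def normalize_grad(x, g):
    # x: input vector, g: upstream gradient dL/dy for y = x / ||x||
    norm = np.sqrt(np.sum(x * x))
    return g / norm - x * np.dot(g, x) / norm**3

x = np.array([3.0, 4.0])
g = np.array([1.0, 0.0])
print(normalize_grad(x, g))            # [ 0.128 -0.096]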
248 PyTorch 9a153412fd4f78b9a9b59bbf85a358339fb69613 C++, Python, Cuda Fix underflow Rsample offers a reparametrization trick, where the parameterized random variable can be constructed via a parameterized deterministic function of a parameter-free random variable. The reparameterized sample therefore becomes differentiable.
sample and rsample both generate samples from the distribution, but only rsample supports differentiating through the sampler. You should use rsample whenever you need to compute gradients of distribution parameters with respect to functions of samples, e.g. in variational inference. SOURCE: https://forum.pyro.ai/t/sample-vs-rsample/2344
sample is literally rsample wrapped in with torch.no_grad(), so when you don't need gradients. SOURCE: https://github.com/cornellius-gp/gpytorch/issues/764
underflow issue in method rsample of dirichlet distribution class statistical distributions statistical distributions dirichlet distribution, sampling, forward pass use a different algorithm use a different algorithm adds a `torch._sample_dirichlet` method in `Distributions.cpp` def test_beta_underflow(self):
+        # For low values of (alpha, beta), the gamma samples can underflow
+        # with float32 and result in a spurious mode at 0.5. To prevent this,
+        # torch._sample_dirichlet works with double precision for intermediate
+        # calculations.
+        set_rng_seed(1)
+        num_samples = 50000
+        for dtype in [torch.float, torch.double]:
+            conc = torch.tensor(1e-2, dtype=dtype)
+            beta_samples = Beta(conc, conc).sample([num_samples])
+            self.assertEqual((beta_samples == 0).sum(), 0)
+            self.assertEqual((beta_samples == 1).sum(), 0)
+            # assert support is concentrated around 0 and 1
+            frac_zeros = float((beta_samples < 0.1).sum()) / num_samples
+            frac_ones = float((beta_samples > 0.9).sum()) / num_samples
+            self.assertEqual(frac_zeros, 0.5, 0.05)
+            self.assertEqual(frac_ones, 0.5, 0.05)
+
+    @unittest.skipIf(not TEST_CUDA, "CUDA not found")
+    def test_beta_underflow_gpu(self):
+        set_rng_seed(1)
+        num_samples = 50000
+        conc = torch.tensor(1e-2, dtype=torch.float64).cuda()
+        beta_samples = Beta(conc, conc).sample([num_samples])
+        self.assertEqual((beta_samples == 0).sum(), 0)
+        self.assertEqual((beta_samples == 1).sum(), 0)
+        # assert support is concentrated around 0 and 1
+        frac_zeros = float((beta_samples < 0.1).sum()) / num_samples
+        frac_ones = float((beta_samples > 0.9).sum()) / num_samples
+        # TODO: increase precision once imbalance on GPU is fixed.
+        self.assertEqual(frac_zeros, 0.5, 0.12)
+        self.assertEqual(frac_ones, 0.5, 0.12)
249 PyTorch 74819087de17de4c8215a7f631d8d4d18dd13d45 C++ Fix inefficient algorithm Mixed precision training with DDP (distributed data parallelization) randomly hangs. The reason is that take_tensors generates a list of bucketed tensors in a nondeterministic order, because the key to the map is a pointer. non-standard precision non-standard precision distributed data parallelization, mixed precision use a different algorithm use a different algorithm use map instead of unordered_map to generate an ordered list of bucketed tensors for parallel training std::unordered_map<at::Type*, TensorGroup> groups; std::map<TypeID, TensorGroup> groups;
250 PyTorch 73bdb661feb195a8b98366db5750b998c025f709 Python Unit test loss of precision BCELoss's outputs and gradInput computations are accurate to around 1e-6 on float types (as a relative value, not absolute), which is reasonable. However, the tests use absolute thresholds: the accumulation of 5 gradInputs has to have error less than 0.0002. loss functions loss functions binary cross entropy loss, testing precision rewrite math formula rewrite math formula restrict input to [0.028, 1- 0.028]  instead of [0.02, 1- 0.02] to decrease error

The worst case for BCELoss's gradInput for each element may be described as 1 / ( (1-x) * x ). Previously, the input to the test was restricted to [0.02, 1- 0.02], giving a worst-case largest gradInput of about 50, a total accumulated grad of 50*5 = 250, and an error of 250 * 1e-6 = 0.00025, which was too big.
    By restricting x to [0.028, 1- 0.028] we get a worst case of 36.74, resulting in a total accumulated grad of 184, which is less than the 200 needed to keep the error below 0.0002.
input_fn=lambda: torch.rand(15, 10).clamp_(2e-2, 1 - 2e-2) input_fn=lambda: torch.rand(15, 10).clamp_(2.8e-2, 1 - 2.8e-2),
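A quick check of the worst-case arithmetic quoted in entry 250 (the helper is hypothetical, not part of the patch): the per-element bound 1/((1-x)*x) drops from about 51 at x = 0.02 to about 36.7 at x = 0.028, bringing the accumulated error of five elements back under the 0.0002 threshold.
def worst_case_grad(x):
    return 1.0 / ((1.0 - x) * x)

for x in (0.02, 0.028):
    g = worst_case_grad(x)
    # five accumulated gradInputs, each accurate to ~1e-6 relative error
    print(x, round(g, 2), 5 * g * 1e-6)
# 0.02  51.02  ~2.55e-04  (over the 2e-4 threshold)
# 0.028 36.74  ~1.84e-04  (under the threshold)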
251 PyTorch 912ee4e40a9f2f2f156e94a76a521d3ed4f49bd0 Python Unit test loss of precision failing unit test linear algebra linear algebra testing, sparse to dense matrix conversion rewrite math formula rewrite math formula elements=st.floats(min_value=0.5, max_value=10), dtype=dt))
D = np.random.uniform(0, 1, size=(first_dim,) + X.shape[1:])
elements=st.floats(min_value=0, max_value=1), dtype=dt))
D = np.zeros((first_dim,) + X.shape[1:])
252 PyTorch b1fa9d2b06714de099e3ae1141d15dcbaba78dd3 C Fix overflow THFile is for loading data from disk or memory, but this is no longer part of PyTorch
data processing data data loading increase variable precision/change variable type increase variable precision, add overflow check increase precision to long, add logic to check that smaller than long max