Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, #3520

Open
lcg0808 opened this issue Jun 17, 2024 · 3 comments


lcg0808 commented Jun 17, 2024

Summary

OS:

Faiss version: faiss-gpu 1.6.3; CUDA 11.3; Python 3.7

Running on:
GPU (8 × A100)

Interface:
Python

I tried to run the following code. With small-scale data, such as 1M samples, it runs normally. With large-scale data, such as 10M samples, it fails with the following error:

Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = float; BT = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at ./faiss/gpu/utils/MatrixMult-inl.cuh:133; details: cublas failed (13): (512, 128) x (262144, 128)' = (512, 262144)
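The `details:` part of the assertion message can be read mechanically. The sketch below is only an illustration and not part of Faiss: `parse_cublas_details` is a hypothetical helper name, the status names are those of the `cublasStatus_t` enum in the cuBLAS headers, and the size estimate assumes a float32 output, matching `AT = float; BT = float` in the message.

```python
import re

# Status codes from the cublasStatus_t enum in the cuBLAS headers.
CUBLAS_STATUS_NAMES = {
    0: "CUBLAS_STATUS_SUCCESS",
    1: "CUBLAS_STATUS_NOT_INITIALIZED",
    3: "CUBLAS_STATUS_ALLOC_FAILED",
    7: "CUBLAS_STATUS_INVALID_VALUE",
    8: "CUBLAS_STATUS_ARCH_MISMATCH",
    11: "CUBLAS_STATUS_MAPPING_ERROR",
    13: "CUBLAS_STATUS_EXECUTION_FAILED",
    14: "CUBLAS_STATUS_INTERNAL_ERROR",
    15: "CUBLAS_STATUS_NOT_SUPPORTED",
    16: "CUBLAS_STATUS_LICENSE_ERROR",
}


def parse_cublas_details(details):
    """Hypothetical helper: pull the status code and matrix shapes out of
    the 'details:' portion of a Faiss GPU assertion message."""
    code = int(re.search(r"cublas failed \((\d+)\)", details).group(1))
    # Shapes are printed as "(rows, cols)"; the bare "(13)" has no comma
    # and is therefore not picked up by this pattern.
    shapes = [(int(r), int(c)) for r, c in re.findall(r"\((\d+), (\d+)\)", details)]
    lhs, rhs, out = shapes[:3]
    return {
        "status": CUBLAS_STATUS_NAMES.get(code, "unknown"),
        "lhs": lhs,
        "rhs": rhs,
        "out": out,
        # Assumption: 4 bytes per element (float32 output).
        "output_bytes": out[0] * out[1] * 4,
    }


details = "cublas failed (13): (512, 128) x (262144, 128)' = (512, 262144)"
info = parse_cublas_details(details)
print(info["status"], info["out"], info["output_bytes"] / 2**20, "MiB")
```

For the message above, status 13 corresponds to `CUBLAS_STATUS_EXECUTION_FAILED`, and the output tile of this single multiplication works out to 512 MiB. The status code alone does not say whether the root cause is memory, the driver, or the build.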

code:
import faiss

def train_kmeans(x, k, ngpu=8):
    # x: embeddings, e.g. shape (1000000, 128); k: number of clusters, e.g. 10000
    d = x.shape[1]
    clus = faiss.Clustering(d, k)
    clus.verbose = True
    clus.niter = 20

    # set high enough that the training set is not subsampled
    clus.max_points_per_centroid = 10000000

    # one resource object and one flat-index config per GPU
    res = [faiss.StandardGpuResources() for i in range(ngpu)]
    flat_config = []
    for i in range(ngpu):
        cfg = faiss.GpuIndexFlatConfig()
        cfg.useFloat16 = False
        cfg.device = i
        flat_config.append(cfg)

    if ngpu == 1:
        index = faiss.GpuIndexFlatIP(res[0], d, flat_config[0])
    else:
        # replicate the inner-product index across all GPUs
        indexes = [faiss.GpuIndexFlatIP(res[i], d, flat_config[i])
                   for i in range(ngpu)]
        index = faiss.IndexReplicas()
        for sub_index in indexes:
            index.addIndex(sub_index)

    # perform the training
    clus.train(x, index)
    centroids = faiss.vector_float_to_array(clus.centroids)

    # obj = faiss.vector_float_to_array(clus.obj)
    # print("final objective: %.4g" % obj[-1])

    return centroids.reshape(k, d)

mdouze added the GPU label Jun 17, 2024

Contributor

mdouze commented Jun 17, 2024

Could you try this with a recent version of Faiss (with CUDA 12)?
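When retesting on a different version, it helps to report exactly which build was loaded. The following is a minimal sketch, assuming the `faiss` Python package exposes `__version__` and, in GPU builds, `get_num_gpus()`; `describe_faiss_environment` is a hypothetical helper name, and the guarded lookups are there so the snippet also runs where Faiss or its GPU support is absent.

```python
import importlib


def describe_faiss_environment():
    """Hypothetical helper: report the loaded Faiss version and the number
    of GPUs Faiss can see, or None where the information is unavailable."""
    try:
        faiss = importlib.import_module("faiss")
    except ImportError:
        # Faiss is not installed in this environment.
        return {"faiss_version": None, "num_gpus": None}

    # get_num_gpus is looked up defensively in case this build lacks it.
    get_num_gpus = getattr(faiss, "get_num_gpus", None)
    return {
        "faiss_version": getattr(faiss, "__version__", None),
        "num_gpus": get_num_gpus() if callable(get_num_gpus) else None,
    }


env = describe_faiss_environment()
print(env)
```

Pasting this output next to the CUDA toolkit and driver versions makes reports from different setups easier to compare.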

PrithivirajDamodaran commented

With faiss-gpu 1.7.2 and CUDA 12 this bug still persists. Is there any progress on this? Please advise.

huangxixiyiqi commented

I encountered this bug with faiss-gpu 1.7.2 and CUDA 12.

Faiss assertion 'err == CUBLAS_STATUS_SUCCESS' failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<float, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<IndexType, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with AT = float; BT = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at /project/faiss/faiss/gpu/utils/MatrixMult-inl.cuh:265; details: cublas failed (13): (512, 512) x (512, 512)' = (512, 512) gemm params m 512 n 512 k 512 trA T trB N lda 512 ldb 512 ldc 512
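The `gemm params` suffix in this message can be checked against the leading-dimension rules that cuBLAS documents for GEMM. The sketch below is an illustration only: `parse_gemm_params` and `leading_dims_ok` are hypothetical helper names, and it assumes that `trA`/`trB` in the log are the transpose flags passed to cuBLAS as-is.

```python
def parse_gemm_params(message):
    """Hypothetical helper: turn 'gemm params m 512 n 512 ...' into a dict."""
    tokens = message.split()
    start = tokens.index("params") + 1
    # After "params" the tokens alternate between a name and its value.
    return dict(zip(tokens[start::2], tokens[start + 1::2]))


def leading_dims_ok(p):
    """Check the cuBLAS GEMM leading-dimension constraints (column-major):
    lda >= m if A is not transposed, else k;
    ldb >= k if B is not transposed, else n;
    ldc >= m."""
    m, n, k = int(p["m"]), int(p["n"]), int(p["k"])
    min_lda = m if p["trA"] == "N" else k
    min_ldb = k if p["trB"] == "N" else n
    return (
        int(p["lda"]) >= max(1, min_lda)
        and int(p["ldb"]) >= max(1, min_ldb)
        and int(p["ldc"]) >= max(1, m)
    )


message = "gemm params m 512 n 512 k 512 trA T trB N lda 512 ldb 512 ldc 512"
params = parse_gemm_params(message)
print(params)
print("leading dimensions valid:", leading_dims_ok(params))
```

Under these assumptions the logged parameters satisfy the constraints, and a 512 × 512 × 512 float32 multiplication is small, so this report does not look like a malformed or oversized GEMM call.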
